Levy processes, which have stationary independent increments, are ideal for 
modelling the various types of noise that can arise in communication channels. 
If a Levy process admits exponential moments, then there exists a parametric 
family of measure changes called Esscher transformations. If the parameter is 
replaced with an independent random variable, the true value of which represents 
a "message", then under the transformed measure the original Levy process takes 
on the character of an "information process". In this paper we develop a theory 
of such Levy information processes. The underlying Levy process, which we call 
the fiducial process, represents the "noise type". Each such noise type is capable 
of carrying a message of a certain specification. A number of examples are worked 
out in detail, including information processes of the Brownian, Poisson, gamma, 
variance gamma, negative binomial, inverse Gaussian, and normal inverse Gaussian 
type. Although in general there is no additive decomposition of information into 
signal and noise, one is led nevertheless for each noise type to a well-defined scheme 
for signal detection and enhancement relevant to a variety of practical situations. 

Key Words: Signal processing; Levy process; Esscher transformation; nonlinear 
filtering; innovations process; information process; cybernetics. 



I. INTRODUCTION 

The idea of filtering the noise out of a noisy message as a way of increasing its information 
content is illustrated by Norbert Wiener in his book Cybernetics (Wiener 1948) by means of 
the following example. The true message is represented by a variable X which has a known 
probability distribution. An agent wishes to determine as best as possible the value of X, 
but due to the presence of noise the agent can only observe a noisy version of the message 
of the form $\xi = X + \epsilon$, where $\epsilon$ is independent of $X$. Wiener shows how, given the observed 
value of the noisy message $\xi$, the original distribution of $X$ can be transformed into an 
improved a posteriori distribution that has a higher information content. The a posteriori 
distribution can then be used to determine a best estimate for the value of X. 
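
Wiener's example can be sketched numerically. The snippet below is only an illustration under assumptions that are not in the text: the message takes two values, the noise is Gaussian with a known standard deviation `sigma`, and the function names and numbers are hypothetical.

```python
import math

def posterior(prior, xi, sigma):
    """A posteriori P(X = x | xi) when xi = X + eps and eps ~ N(0, sigma^2) is independent of X."""
    weights = {x: p * math.exp(-0.5 * ((xi - x) / sigma) ** 2) for x, p in prior.items()}
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}

def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

prior = {0.0: 0.5, 1.0: 0.5}                 # assumed a priori distribution of the message X
post = posterior(prior, xi=0.9, sigma=0.5)   # assumed observed value of the noisy message
best_estimate = sum(x * p for x, p in post.items())  # conditional mean of X
```

For this particular observation the a posteriori distribution is more concentrated than the prior (its entropy is lower), and the conditional mean serves as the least-squares estimate of X.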

The theory of filtering was developed in the 1940s when the inefficiency of anti-aircraft 
fire made it imperative to introduce effective filtering-based devices (Wiener 1949, 1954). 
A breakthrough came with the work of Kalman, who reformulated the theory in a manner 
better suited to dynamical state-estimation problems (Kailath 1974, Davis 1977). This 
period coincided with the emergence of the modern control theory of Bellman and Pontryagin 
(Bellman 1961, Pontryagin et al. 1962). Owing to the importance of its applications, 
much work has been carried out since then. According to an estimate of Kalman (1994), 
over 200,000 articles and monographs had been published on applications of the Kalman 
filter alone. The theory of stochastic filtering, in its modern form, is not much different 
conceptually from the elementary example described by Wiener in the 1940s. The message, 
instead of being represented by a single variable, in the general setup can take the form 
of a time series (the "signal" or "message" process). The information made available to 
the agent also takes the form of a time series (the "observation" or "information" process), 
typically given by the sum of two terms, the first being a functional of the signal process, and 
the second being a noise process. The nature of the signal process can be rather general, 
but in most applications the noise is chosen to be a Wiener process (see, e.g., Liptser & 
Shiryaev 2000, Xiong 2008, Bain & Crisan 2010). There is no reason a priori, however, why 
an information process should be "additive", or even why it should be given as a functional 
of a signal process and a noise process. From a mathematical perspective, it seems that 
the often proposed ansatz of an additive decomposition of the observation process is well- 
adapted to the situation where the noise is Gaussian, but is not so natural when the noise is 
discontinuous. Thus while a good deal of recent research has been carried out on the problem 
of filtering noisy information containing jumps (see, e.g., Rutkowski 1994, Ahn & Feldman 
1999, Meyer-Brandis & Proske 2004, Poklukar 2006, Popa & Sritharan 2009, Grigelionis & 
Mikulevicius 2011, and references cited therein), such work has usually been pursued under 
the assumption of an additive relation between signal and noise, and it is not unreasonable 
to ask whether a more systematic treatment of the problem might be available that involves 
no presumption of additivity and that is more naturally adapted to the mathematics of the 
situation. 

The purpose of the present paper is to introduce a broad class of information processes 
suitable for modelling situations involving discontinuous signals, discontinuous noise, and 
discontinuous information. No assumption is made to the effect that information can be 
expressed as a function of signal and noise. Instead, information processes are classified 
according to their "noise type". Information processes of the same noise type are then 
distinguished from one another by the messages that they carry. Each noise type is associated 
to a Levy process, which we call the fiducial process. The fiducial process is the information 
process that results for a given noise type in the case of a null message, and can be thought 
of as a "pure noise" process of that noise type. Information processes can then be classified 
by the characteristics of the associated fiducial processes. To keep the discussion elementary, 
we consider the case of a one-dimensional fiducial process and examine the situation where 
the message is represented by a single random variable. The goal is to construct the optimal 
filter for the class of information processes that we consider in the form of a map that takes 
the a priori distribution of the message to an a posteriori distribution that depends on the 
information that has been made available. A number of examples will be presented. The 
results vary remarkably in detail and character for the different types of filters considered, 
and yet there is an overriding unity in the general scheme, which allows for the construction 
of a multitude of examples and applications. 

A synopsis of the main ideas, which we develop more fully in the remainder of the paper, 
can be presented as follows. We recall the idea of the Esscher transform as a change of 
probability measure on a probability space $(\Omega, \mathcal{F}, \mathbb{P}_0)$ that supports a Levy process $\{\xi_t\}_{t\geq 0}$ 
that possesses $\mathbb{P}_0$-exponential moments. The space of admissible moments is the set $A = 
\{w \in \mathbb{R} : \mathbb{E}^{\mathbb{P}_0}[\exp(w\xi_t)] < \infty\}$. The associated Levy exponent $\psi(\alpha) = t^{-1}\ln \mathbb{E}^{\mathbb{P}_0}[\exp(\alpha\xi_t)]$ 
then exists for all $\alpha \in A_{\mathbb{C}} := \{w \in \mathbb{C} : {\rm Re}\, w \in A\}$, and does not depend on $t$. A 
parametric family of measure changes $\mathbb{P}_0 \to \mathbb{P}_\lambda$ commonly called Esscher transformations 
can be constructed by use of the exponential martingale family $\{\rho_t^\lambda\}_{t\geq 0}$, defined for each 
$\lambda \in A$ by $\rho_t^\lambda = \exp(\lambda\xi_t - \psi(\lambda)t)$. If $\{\xi_t\}$ is a $\mathbb{P}_0$-Brownian motion, then $\{\xi_t\}$ is $\mathbb{P}_\lambda$-Brownian 
with drift $\lambda$; if $\{\xi_t\}$ is a $\mathbb{P}_0$-Poisson process with intensity $m$, then $\{\xi_t\}$ is $\mathbb{P}_\lambda$-Poisson with 
intensity $e^\lambda m$; if $\{\xi_t\}$ is a $\mathbb{P}_0$-gamma process with rate parameter $m$ and scale parameter $\kappa$, 
then $\{\xi_t\}$ is $\mathbb{P}_\lambda$-gamma with rate parameter $m$ and scale parameter $\kappa/(1 - \lambda)$. Each case 
is different in character. A natural generalisation of the Esscher transform results when 
the parameter $\lambda$ in the measure change is replaced by a random variable $X$. From the 
perspective of the new measure $\mathbb{P}_X$, the process $\{\xi_t\}$ retains the "noisy" character of its $\mathbb{P}_0$-Levy 
origin, but also carries information about $X$. In particular, if one assumes that $X$ and 
$\{\xi_t\}$ are $\mathbb{P}_0$-independent, and that the support of $X$ lies in $A$, then we say that $\{\xi_t\}$ defines 
a Levy information process under $\mathbb{P}_X$ carrying the message $X$. Thus, the change of measure 
inextricably intertwines signal and noise. More abstractly, we say that on a probability space 
$(\Omega, \mathcal{F}, \mathbb{P})$ a random process $\{\xi_t\}$ is a Levy information process with message (or "signal") 
$X$ and noise type (or "fiducial exponent") $\psi_0(\alpha)$ if $\{\xi_t\}$ is conditionally $\mathbb{P}$-Levy given $X$, 
with Levy exponent $\psi_0(\alpha + X) - \psi_0(X)$ for $\alpha \in \mathbb{C}^{\rm I} := \{w \in \mathbb{C} : {\rm Re}\, w = 0\}$. We are thus able 
to classify Levy information processes by their noise type, and for each noise type we can 
specify the class of random variables that are admissible as signals that can be carried in 
the environment of such noise. We consider a number of different noise types, and construct 
explicit representations of the associated information processes. We also derive an expression 
for the optimal filter in the general situation, which transforms the a priori distribution of 
the signal to the improved a posteriori distribution that can be inferred on the basis of 
received information. 

The plan of the paper is as follows. In Section II, after recalling some facts about 
processes with stationary and independent increments, we define Levy information, and in 
Proposition 1 we show that the signal carried by a Levy information process is effectively 
"revealed" after the passage of sufficient time. In Section III we present in Proposition 2 
an explicit construction using a change of measure technique that ensures the existence of 
Levy information processes, and in Proposition 3 we prove a converse to the effect that 
any Levy information process can be obtained in this way. In Proposition 4 we construct 
the optimal filter for general Levy information processes, and in Proposition 5 we show 
that such processes have the Markov property. In Proposition 6 we establish a result that 
indicates in more detail how the information content of the signal is coded into the structure 
of an information process. Then in Proposition 7 we present a general construction of the 
so-called innovations process associated with Levy information. Finally in Section IV we 
proceed to examine a number of specific examples of Levy information processes, for which 
explicit representations are constructed in Propositions 8 to 15. 

II. LEVY INFORMATION 

We assume that the reader is familiar with the theory of Levy processes (Bingham 1975, 
Sato 1999, Applebaum 2004, Bertoin 2004, Protter 2005, Kyprianou 2006). For an overview 
of some of the specific Levy processes considered later in this paper we refer the reader to 
Schoutens (2003). A real-valued process $\{\xi_t\}_{t\geq 0}$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is a Levy 
process if: (i) $\mathbb{P}(\xi_0 = 0) = 1$, (ii) $\{\xi_t\}$ has stationary and independent increments, (iii) 
$\lim_{t\to s} \mathbb{P}(|\xi_t - \xi_s| > \epsilon) = 0$, and (iv) $\{\xi_t\}$ is almost surely cadlag. For a Levy process $\{\xi_t\}$ to 
give rise to a class of information processes, we require that it should possess exponential 
moments. Let us consider the set defined for some (equivalently for all) $t > 0$ by 

$$A = \{w \in \mathbb{R} : \mathbb{E}^{\mathbb{P}}[\exp(w\xi_t)] < \infty\}. \tag{1}$$

If $A$ contains points other than $w = 0$, then we say that $\{\xi_t\}$ possesses exponential moments. 
We define a function $\psi : A \to \mathbb{R}$ called the Levy exponent (or cumulant function), such that 

$$\mathbb{E}^{\mathbb{P}}[\exp(\alpha\xi_t)] = \exp(\psi(\alpha)t) \tag{2}$$

for $\alpha \in A$. If a Levy process possesses exponential moments, then an exercise shows that 
$\psi(\alpha)$ is convex on $A$, that the mean and variance of $\xi_t$ are given respectively by $\psi'(0)t$ and 
$\psi''(0)t$, and that as a consequence of the convexity of $\psi(\alpha)$ the marginal exponent $\psi'(\alpha)$ 
possesses a unique inverse $I(y)$ such that $I(\psi'(\alpha)) = \alpha$ for $\alpha \in A$. The Levy exponent 
extends to a function $\psi : A_{\mathbb{C}} \to \mathbb{C}$ where $A_{\mathbb{C}} = \{w \in \mathbb{C} : {\rm Re}\, w \in A\}$, and it can be shown 
(Sato 1999, Theorem 25.17) that $\psi(\alpha)$ admits a Levy-Khintchine representation of the form 

$$\psi(\alpha) = p\alpha + \tfrac{1}{2} q\alpha^2 + \int_{\mathbb{R}\setminus\{0\}} \left(e^{\alpha z} - 1 - \alpha z \mathbb{1}\{|z| < 1\}\right)\nu(dz) \tag{3}$$

with the property that (2) holds for all $\alpha \in A_{\mathbb{C}}$. Here $\mathbb{1}\{\cdot\}$ denotes the indicator function, 
$p \in \mathbb{R}$ and $q \geq 0$ are constants, and the so-called Levy measure $\nu(dz)$ is a positive measure 
defined on $\mathbb{R}\setminus\{0\}$ satisfying 

$$\int_{\mathbb{R}\setminus\{0\}} (1 \wedge z^2)\,\nu(dz) < \infty. \tag{4}$$

If the Levy process possesses exponential moments, then for $\alpha \in A$ we also have 

$$\int_{\mathbb{R}\setminus\{0\}} e^{\alpha z}\, \mathbb{1}\{|z| \geq 1\}\,\nu(dz) < \infty. \tag{5}$$

The Levy measure has the following interpretation: if $B$ is a measurable subset of $\mathbb{R}\setminus\{0\}$, 
then $\nu(B)$ is the rate at which jumps arrive for which the jump size lies in $B$. Consider 
the sets defined for $n \in \mathbb{N}$ by $B_n = \{z \in \mathbb{R} \,|\, 1/n \leq |z| \leq 1\}$. If $\nu(B_n)$ tends to infinity for 
large $n$ we say that $\{\xi_t\}$ is a process of infinite activity, meaning that the rate of arrival of 
small jumps is unbounded. If $\nu(\mathbb{R}\setminus\{0\}) < \infty$ one says that $\{\xi_t\}$ has finite activity. We refer 
to the data $K = (p, q, \nu)$ as the characteristic triplet (or "characteristic") of the associated 
Levy process. Thus we can classify a Levy process abstractly by its characteristic $K$, or, 
equivalently, its exponent $\psi(\alpha)$. This means one can speak of a "type" of Levy noise by 
reference to the associated characteristic or exponent. 
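
The properties of the exponent listed above can be checked numerically for familiar noise types. The following is a minimal sketch and not part of the paper: the three exponents are the standard ones for Brownian motion, a Poisson process with intensity `m`, and a standard gamma process with rate `m`. The helper names, the finite-difference step, and the bisection bracket are illustrative assumptions.

```python
import math

def psi_brownian(a):
    """Levy exponent of standard Brownian motion."""
    return 0.5 * a * a

def psi_poisson(a, m=2.0):
    """Levy exponent of a Poisson process with intensity m."""
    return m * (math.exp(a) - 1.0)

def psi_gamma(a, m=2.0):
    """Levy exponent of a standard gamma process with rate m (finite for a < 1)."""
    return -m * math.log(1.0 - a)

def derivative(f, a, h=1e-4):
    """Central finite-difference approximation to f'(a)."""
    return (f(a + h) - f(a - h)) / (2.0 * h)

def second_derivative(f, a, h=1e-4):
    """Central finite-difference approximation to f''(a)."""
    return (f(a + h) - 2.0 * f(a) + f(a - h)) / (h * h)

def inverse_marginal(f, y, lo, hi, tol=1e-10):
    """Solve psi'(a) = y by bisection; psi' is increasing because psi is convex."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if derivative(f, mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mean_rate = derivative(psi_poisson, 0.0)         # psi'(0): mean of xi_t per unit time
var_rate = second_derivative(psi_poisson, 0.0)   # psi''(0): variance of xi_t per unit time
alpha = inverse_marginal(psi_poisson, 2.0 * math.exp(0.3), -5.0, 5.0)  # I(psi'(0.3))
```

For the Poisson exponent both rates equal the intensity m, and the inverse marginal exponent recovers the argument 0.3 up to the tolerance of the bisection.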

Now suppose we fix a measure $\mathbb{P}_0$ on a measurable space $(\Omega, \mathcal{F})$, and let $\{\xi_t\}$ be $\mathbb{P}_0$-Levy, 
with exponent $\psi_0(\alpha)$. There exists a parametric family of probability measures $\{\mathbb{P}_\lambda\}_{\lambda\in A}$ on 
$(\Omega, \mathcal{F})$ such that for each choice of $\lambda$ the process $\{\xi_t\}$ is $\mathbb{P}_\lambda$-Levy. The changes of measure 
arising in this way are called Esscher transformations (Esscher 1932, Gerber & Shiu 1994, 
Chan 1999, Kallsen & Shiryaev 2002, Hubalek & Sgarra 2006). Under an Esscher transformation 
the characteristics of a Levy process are transformed from one type to another, and 
one can speak of a "family" of Levy processes interrelated by Esscher transformations. The 
relevant change of measure can be specified by use of the process $\{\rho_t^\lambda\}$ defined for $\lambda \in A$ by 

$$\rho_t^\lambda := \left.\frac{d\mathbb{P}_\lambda}{d\mathbb{P}_0}\right|_{\mathcal{F}_t} = \exp(\lambda\xi_t - \psi_0(\lambda)t), \tag{6}$$
where $\mathcal{F}_t = \sigma[\{\xi_s\}_{0\leq s\leq t}]$. One can check that $\{\rho_t^\lambda\}$ is an $(\{\mathcal{F}_t\}, \mathbb{P}_0)$-martingale: indeed, as 
a consequence of the fact that $\{\xi_t\}$ has stationary and independent increments we have 

$$\mathbb{E}_s^{\mathbb{P}_0}[\rho_t^\lambda] = \mathbb{E}_s^{\mathbb{P}_0}\left[e^{\lambda(\xi_t - \xi_s)}\right] e^{\lambda\xi_s - t\psi_0(\lambda)} = \rho_s^\lambda \tag{7}$$

for $s \leq t$, where $\mathbb{E}_t^{\mathbb{P}_0}[\,\cdot\,]$ denotes conditional expectation under $\mathbb{P}_0$ with respect to $\mathcal{F}_t$. It is 
straightforward to show that $\{\xi_t\}$ has $\mathbb{P}_\lambda$-stationary and independent increments, and that 
the $\mathbb{P}_\lambda$-exponent of $\{\xi_t\}$, which is defined on the set $A_{\mathbb{C}}^\lambda := \{w \in \mathbb{C} \,|\, {\rm Re}\, w + \lambda \in A\}$, is given 
by 

$$\psi_\lambda(\alpha) := t^{-1}\ln \mathbb{E}^{\mathbb{P}_\lambda}[\exp(\alpha\xi_t)] = \psi_0(\alpha + \lambda) - \psi_0(\lambda), \tag{8}$$

from which by use of the Levy-Khintchine representation (3) one can work out the characteristic 
triplet $K_\lambda$ of $\{\xi_t\}$ under $\mathbb{P}_\lambda$. We observe that if the Esscher martingale (6) is expanded as 
a power series in $\lambda$, then the resulting coefficients, which are given by polynomials in $\xi_t$ and 
$t$, form a so-called Sheffer set (Schoutens & Teugels 1998), each element of which defines an 
$(\{\mathcal{F}_t\}, \mathbb{P}_0)$-martingale. The first three of these polynomials take the form $Q^1(x, t) = x - \psi' t$, 
$Q^2(x, t) = \tfrac{1}{2}[(x - \psi' t)^2 - \psi'' t]$, and $Q^3(x, t) = \tfrac{1}{6}[(x - \psi' t)^3 - 3\psi'' t (x - \psi' t) - \psi''' t]$, where 
$\psi' = \psi_0'(0)$, $\psi'' = \psi_0''(0)$, and $\psi''' = \psi_0'''(0)$. The corresponding polynomial Levy-Sheffer 
martingales are given by $Q_t^1 = Q^1(\xi_t, t)$, $Q_t^2 = Q^2(\xi_t, t)$, and $Q_t^3 = Q^3(\xi_t, t)$. 
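
The effect of an Esscher transformation on the exponent can be illustrated directly from the relation that the transformed exponent equals the fiducial exponent evaluated at the shifted argument, less its value at the parameter. The sketch below is an illustration only; the helper `esscher` and the parameter values are assumptions. It checks the two simplest cases mentioned in the introduction: Brownian motion acquires a drift equal to the parameter, and a Poisson process has its intensity multiplied by the exponential of the parameter.

```python
import math

def esscher(psi0, lam):
    """Return the transformed exponent a -> psi0(a + lam) - psi0(lam)."""
    return lambda a: psi0(a + lam) - psi0(lam)

m, lam = 2.0, 0.7   # assumed Poisson intensity and Esscher parameter

def psi_poisson(a):
    """Exponent of a Poisson process with intensity m."""
    return m * (math.exp(a) - 1.0)

def psi_brownian(a):
    """Exponent of standard Brownian motion."""
    return 0.5 * a * a

tilted_poisson = esscher(psi_poisson, lam)
tilted_brownian = esscher(psi_brownian, lam)

grid = [-1.0, -0.25, 0.0, 0.5, 1.5]
# Largest deviation from a Poisson exponent with intensity exp(lam) * m.
poisson_gap = max(abs(tilted_poisson(a) - math.exp(lam) * m * (math.exp(a) - 1.0)) for a in grid)
# Largest deviation from a Brownian exponent with drift lam.
brownian_gap = max(abs(tilted_brownian(a) - (lam * a + 0.5 * a * a)) for a in grid)
```

Both gaps vanish up to floating-point rounding, in line with the Brownian and Poisson statements made in the synopsis.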

In what follows we use the terms "signal" and "message" interchangeably. We write 
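$\mathbb{C}^{\rm I} = \{w \in \mathbb{C} : {\rm Re}\, w = 0\}$. For any random variable $Z$ on $(\Omega, \mathcal{F}, \mathbb{P})$ we write $\mathcal{F}^Z = \sigma[Z]$, 
and when it is convenient we write $\mathbb{E}^{\mathbb{P}}[\,\cdot\,|\,Z]$ for $\mathbb{E}^{\mathbb{P}}[\,\cdot\,|\,\mathcal{F}^Z]$. For processes we use both of the 
notations $\{Z_t\}$ and $\{Z(t)\}$, depending on the context. 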

With these background remarks in mind, we are in a position to define a Levy information 
process. We confine the discussion to the case of a "simple" message, represented by a 
random variable X. In the situation when the noise is Brownian motion, the information 
admits a linear decomposition into signal and noise. In the general situation the relation 
between signal and noise is more subtle, and has the character of a fibre space, where one 
thinks of the points of the base space as representing the different noise types, and the points 
of the fibres as corresponding to the different information processes that one can construct 
in association with a given noise type. Alternatively, one can think of the base as being 
the convex space of Levy characteristics, and the fibre over a given point of the base as the 
convex space of messages that are compatible with the associated noise type. 

We fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, and an Esscher family of Levy characteristics $K_\lambda$, 
$\lambda \in A$, with associated Levy exponents $\psi_\lambda(\alpha)$, $\alpha \in A_{\mathbb{C}}^\lambda$. We refer to $K_0$ as the fiducial 
characteristic, and $\psi_0(\alpha)$ as the fiducial exponent. The intuition here is that the abstract 
Levy process of characteristic $K_0$ and exponent $\psi_0(\alpha)$, which we call the "fiducial" process, 
represents the noise type of the associated information process. Thus we can use $K_0$, or 
equivalently $\psi_0(\alpha)$, to label the noise type. 

Definition 1 By a Levy information process with fiducial characteristic $K_0$, carrying the 
message $X$, we mean a random process $\{\xi_t\}$, together with a random variable $X$, such that 
$\{\xi_t\}$ is conditionally $K_X$-Levy given $\mathcal{F}^X$. 

Thus, given $\mathcal{F}^X$ we require $\{\xi_t\}$ to have conditionally independent and stationary increments 
under $\mathbb{P}$, and to possess a conditional exponent of the form 

$$\psi_X(\alpha) := t^{-1}\ln \mathbb{E}^{\mathbb{P}}[\exp(\alpha\xi_t) \,|\, \mathcal{F}^X] = \psi_0(\alpha + X) - \psi_0(X) \tag{9}$$

for $\alpha \in \mathbb{C}^{\rm I}$, where $\psi_0(\alpha)$ is the fiducial exponent of the specified noise type. It is implicit 
in the statement of Definition 1 that a certain compatibility condition holds between the 
message and the noise type. For any random variable $X$ we define its support $S_X$ to be 
the smallest closed set $F$ with the property that $\mathbb{P}(X \in F) = 1$. Then we say that $X$ is 
compatible with the fiducial exponent $\psi_0(\alpha)$ if $S_X \subset A$. Intuitively speaking, the compatibility 
condition ensures that we can use $X$ to make a random Esscher transformation. In 
the theory of signal processing, it is advantageous to require that the variables to be estimated 
should be square integrable. This condition ensures that the conditional expectation 
exists and admits the interpretation as a best estimate in the sense of least squares. For 
our purpose it will suffice to assume throughout the paper that the information process is 
square integrable under $\mathbb{P}$. This in turn implies that $\psi_0'(X)$ is square integrable, and that 
$\psi_0''(X)$ is integrable. Note that we do not require that the Levy information process should 
possess exponential moments under $\mathbb{P}$, but a sufficient condition for this to be the case is 
that there should exist a nonvanishing real number $\epsilon$ such that $\lambda + \epsilon \in A$ for all $\lambda \in S_X$. 

To gain a better understanding of the sense in which the information process $\{\xi_t\}$ actually 
"carries" the message $X$, it will be useful to investigate its asymptotic behaviour. We write 
$I_0(y)$ for the inverse marginal fiducial exponent. 

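Proposition 1 Let $\{\xi_t\}$ be a Levy information process with fiducial exponent $\psi_0(\alpha)$ and 
message $X$. Then for every $\epsilon > 0$ we have 

$$\lim_{t\to\infty} \mathbb{P}\left[\,|I_0(t^{-1}\xi_t) - X| \geq \epsilon\,\right] = 0. \tag{10}$$

Proof. It follows from (9) that $\psi_X'(0) = \psi_0'(X)$, and hence that at any time $t$ the conditional 
mean of the random variable $t^{-1}\xi_t$ is given by 

$$\mathbb{E}^{\mathbb{P}}[t^{-1}\xi_t \,|\, \mathcal{F}^X] = \psi_0'(X). \tag{11}$$

A calculation then shows that the conditional variance of $t^{-1}\xi_t$ takes the form 

$$\mathrm{Var}^{\mathbb{P}}[t^{-1}\xi_t \,|\, \mathcal{F}^X] := \mathbb{E}^{\mathbb{P}}\left[(t^{-1}\xi_t - \psi_0'(X))^2 \,\big|\, \mathcal{F}^X\right] = \frac{1}{t}\,\psi_0''(X), \tag{12}$$

which allows us to conclude that 

$$\mathbb{E}^{\mathbb{P}}\left[(t^{-1}\xi_t - \psi_0'(X))^2\right] = \frac{1}{t}\,\mathbb{E}^{\mathbb{P}}[\psi_0''(X)], \tag{13}$$

and hence that 

$$\lim_{t\to\infty} \mathbb{E}^{\mathbb{P}}\left[(t^{-1}\xi_t - \psi_0'(X))^2\right] = 0. \tag{14}$$

On the other hand for all $\epsilon > 0$ we have 

$$\mathbb{P}\left[\,|t^{-1}\xi_t - \psi_0'(X)| \geq \epsilon\,\right] \leq \frac{1}{\epsilon^2}\,\mathbb{E}^{\mathbb{P}}\left[(t^{-1}\xi_t - \psi_0'(X))^2\right] \tag{15}$$

by Chebychev's inequality, from which we deduce that 

$$\lim_{t\to\infty} \mathbb{P}\left[\,|t^{-1}\xi_t - \psi_0'(X)| \geq \epsilon\,\right] = 0, \tag{16}$$

and it follows that $I_0(t^{-1}\xi_t)$ converges to $X$ in probability. □ 

Thus we see that the information process does indeed carry information about the message, 
and in the long run "reveals" it. The intuition here is that as more information is 
gained we improve our estimate of $X$ to the point that the value of $X$ eventually becomes 
known with near certainty. 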
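
The asymptotic statement of Proposition 1 can be illustrated by simulation. The sketch below assumes Poisson noise with intensity `m`, for which the marginal fiducial exponent is m e^a and its inverse is ln(y/m), so that conditionally on the message X the information process is Poisson with intensity m e^X. The three-point message distribution, the seed, and the helper names are hypothetical. For simplicity each horizon uses an independent draw of the process value, which suffices to illustrate a statement about marginal laws.

```python
import math
import random

def poisson_count(rate, t, rng):
    """Number of arrivals on [0, t] of a Poisson process with the given rate."""
    count, clock = 0, rng.expovariate(rate)
    while clock <= t:
        count += 1
        clock += rng.expovariate(rate)
    return count

def estimate_message(xi_t, t, m):
    """I_0(xi_t / t) for Poisson noise, where I_0(y) = ln(y / m)."""
    if xi_t == 0:
        return float("-inf")   # no jumps observed yet; the estimate is uninformative
    return math.log(xi_t / (t * m))

rng = random.Random(12345)
m = 2.0
x_true = rng.choice([-0.5, 0.0, 0.5])    # the message X, drawn from an assumed prior
errors = {}
for t in (10.0, 1000.0, 20000.0):
    xi_t = poisson_count(m * math.exp(x_true), t, rng)   # conditionally Poisson given X
    errors[t] = abs(estimate_message(xi_t, t, m) - x_true)
```

At the longest horizon the estimation error is of the order of the inverse square root of the number of observed jumps, consistent with the variance bound used in the proof.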
III. PROPERTIES OF LEVY INFORMATION 

It will be useful if we present a construction that ensures the existence of Levy information 
processes. First we select a noise type by specification of a fiducial characteristic $K_0$. Next 
we introduce a probability space $(\Omega, \mathcal{F}, \mathbb{P}_0)$ that supports the existence of a $\mathbb{P}_0$-Levy process 
$\{\xi_t\}$ with the given fiducial characteristic, together with an independent random variable $X$ 
that is compatible with $K_0$. 

Write $\{\mathcal{F}_t\}$ for the filtration generated by $\{\xi_t\}$, and $\{\mathcal{G}_t\}$ for the filtration generated by 
$\{\xi_t\}$ and $X$ jointly: $\mathcal{G}_t = \sigma[\{\xi_s\}_{0\leq s\leq t}, X]$. Let $\psi_0(\alpha)$ be the fiducial exponent associated with 
$K_0$. One can check that the process $\{\rho_t^X\}$ defined by 

$$\rho_t^X = \exp(X\xi_t - \psi_0(X)t) \tag{17}$$

is a $(\{\mathcal{G}_t\}, \mathbb{P}_0)$-martingale. We are thus able to introduce a change of measure $\mathbb{P}_0 \to \mathbb{P}_X$ on 
$(\Omega, \mathcal{F}, \mathbb{P}_0)$ by setting 

$$\left.\frac{d\mathbb{P}_X}{d\mathbb{P}_0}\right|_{\mathcal{G}_t} = \rho_t^X. \tag{18}$$

It should be evident that $\{\xi_t\}$ is conditionally $\mathbb{P}_X$-Levy given $\mathcal{F}^X$, since for fixed $X$ the 
measure change is an Esscher transformation. In particular, a calculation shows that the 
conditional exponent of $\xi_t$ under $\mathbb{P}_X$ is given by 

$$t^{-1}\ln \mathbb{E}^{\mathbb{P}_X}[\exp(\alpha\xi_t) \,|\, \mathcal{F}^X] = \psi_0(\alpha + X) - \psi_0(X) \tag{19}$$

for $\alpha \in \mathbb{C}^{\rm I}$, which shows that the conditions of Definition 1 are satisfied, allowing us to 
conclude the following: 

Proposition 2 The $\mathbb{P}_0$-Levy process $\{\xi_t\}$ is a $\mathbb{P}_X$-Levy information process, with message 
$X$ and noise type $\psi_0(\alpha)$. 

In fact, the converse also holds: if we are given a Levy information process, then by a 
change of measure we can find a Levy process and an independent "message" variable. Here 
follows a more precise statement. 

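Proposition 3 Let $\{\xi_t\}$ be a Levy information process on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with 
message $X$ and noise type $\psi_0(\alpha)$. Then there exists a change of measure $\mathbb{P} \to \mathbb{P}_0$ such that 
$\{\xi_t\}$ and $X$ are $\mathbb{P}_0$-independent, $\{\xi_t\}$ is $\mathbb{P}_0$-Levy with exponent $\psi_0(\alpha)$, and the probability law 
of $X$ under $\mathbb{P}_0$ is the same as the probability law of $X$ under $\mathbb{P}$. 

Proof. First we establish that the process $\{\tilde\rho_t^X\}$ defined by the expression $\tilde\rho_t^X = \exp(-X\xi_t + 
\psi_0(X)t)$ is a $(\{\mathcal{G}_t\}, \mathbb{P})$-martingale. We have 

$$\begin{aligned}
\mathbb{E}^{\mathbb{P}}[\tilde\rho_t^X \,|\, \mathcal{G}_s] &= \mathbb{E}^{\mathbb{P}}[\exp(-X\xi_t + \psi_0(X)t) \,|\, \mathcal{G}_s] \\
&= \mathbb{E}^{\mathbb{P}}[\exp(-X(\xi_t - \xi_s)) \,|\, \mathcal{G}_s]\, \exp(-X\xi_s + \psi_0(X)t) \\
&= \exp(\psi_X(-X)(t - s))\, \exp(-X\xi_s + \psi_0(X)t)
\end{aligned} \tag{20}$$

by virtue of the fact that $\{\xi_t\}$ is $\mathcal{F}^X$-conditionally Levy under $\mathbb{P}$. By use of (9) we deduce 
that $\psi_X(-X) = -\psi_0(X)$, and hence that $\mathbb{E}^{\mathbb{P}}[\tilde\rho_t^X \,|\, \mathcal{G}_s] = \tilde\rho_s^X$, as required. Then we use $\{\tilde\rho_t^X\}$ 
to define a change of measure $\mathbb{P} \to \mathbb{P}_0$ on $(\Omega, \mathcal{F}, \mathbb{P})$ by setting 

$$\left.\frac{d\mathbb{P}_0}{d\mathbb{P}}\right|_{\mathcal{G}_t} = \tilde\rho_t^X. \tag{21}$$

To show that $\xi_t$ and $X$ are $\mathbb{P}_0$-independent for all $t$, it suffices to show that their joint 
characteristic function under $\mathbb{P}_0$ factorises. Letting $\alpha, \beta \in \mathbb{C}^{\rm I}$, we have 

$$\begin{aligned}
\mathbb{E}^{\mathbb{P}_0}[\exp(\alpha\xi_t + \beta X)] &= \mathbb{E}^{\mathbb{P}}[\exp(-X\xi_t + \psi_0(X)t)\exp(\alpha\xi_t + \beta X)] \\
&= \mathbb{E}^{\mathbb{P}}\left[\mathbb{E}^{\mathbb{P}}[\exp((-X + \alpha)\xi_t + \psi_0(X)t + \beta X) \,|\, \mathcal{F}^X]\right] \\
&= \mathbb{E}^{\mathbb{P}}[\exp(\psi_X(-X + \alpha)t + \psi_0(X)t + \beta X)] \\
&= \exp(\psi_0(\alpha)t)\, \mathbb{E}^{\mathbb{P}}[\exp(\beta X)],
\end{aligned} \tag{22}$$

where the last step follows from (9). This argument can be extended to show that $\{\xi_t\}$ and 
$X$ are $\mathbb{P}_0$-independent. Next we observe that 

$$\begin{aligned}
\mathbb{E}^{\mathbb{P}_0}&[\exp(\alpha(\xi_u - \xi_t) + \beta\xi_t)] \\
&= \mathbb{E}^{\mathbb{P}}[\exp(-X\xi_u + \psi_0(X)u + \alpha(\xi_u - \xi_t) + \beta\xi_t)] \\
&= \mathbb{E}^{\mathbb{P}}\left[\mathbb{E}^{\mathbb{P}}[\exp(-X\xi_u + \psi_0(X)u + \alpha(\xi_u - \xi_t) + \beta\xi_t) \,|\, \mathcal{F}^X]\right] \\
&= \mathbb{E}^{\mathbb{P}}\left[\mathbb{E}^{\mathbb{P}}[\exp(\psi_0(X)u + (\alpha - X)(\xi_u - \xi_t) + (\beta - X)\xi_t) \,|\, \mathcal{F}^X]\right] \\
&= \mathbb{E}^{\mathbb{P}}[\exp(\psi_0(X)u + \psi_X(\alpha - X)(u - t) + \psi_X(\beta - X)t)] \\
&= \exp(\psi_0(\alpha)(u - t))\, \exp(\psi_0(\beta)t)
\end{aligned} \tag{23}$$

for $u \geq t \geq 0$, and it follows that $\xi_u - \xi_t$ and $\xi_t$ are independent. This argument can be 
extended to show that $\{\xi_t\}$ has $\mathbb{P}_0$-independent increments. Finally, if we set $\alpha = 0$ in (22) 
it follows that the probability laws of $X$ under $\mathbb{P}_0$ and $\mathbb{P}$ are identical; if we set $\beta = 0$ in 
(22) it follows that the $\mathbb{P}_0$ exponent of $\{\xi_t\}$ is $\psi_0(\alpha)$; and if we set $\beta = 0$ in (23) it follows 
that $\{\xi_t\}$ is $\mathbb{P}_0$-stationary. □ 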

Going forward, we adopt the convention that $\mathbb{P}$ always denotes the "physical" measure 
in relation to which an information process with message $X$ is defined, and that $\mathbb{P}_0$ denotes 
the transformed measure with respect to which the information process and the message 
decouple. Therefore, henceforth we write $\mathbb{P}$ rather than $\mathbb{P}_X$. In addition to establishing the 
existence of Levy information processes, the results of Proposition 3 provide useful tools 
for calculations, allowing us to work out properties of information processes by referring 
the calculations back to $\mathbb{P}_0$. We consider as an example the problem of working out the 
$\mathcal{F}_t$-conditional expectation under $\mathbb{P}$ of a $\mathcal{G}_t$-measurable integrable random variable $Z$. The 
$\mathbb{P}$-expectation of $Z$ can be written in terms of $\mathbb{P}_0$-expectations, and is given by a "generalised 
Bayes formula" (Kallianpur & Striebel 1968) of the form 

$$\mathbb{E}^{\mathbb{P}}[Z \,|\, \mathcal{F}_t] = \frac{\mathbb{E}^{\mathbb{P}_0}[\rho_t^X Z \,|\, \mathcal{F}_t]}{\mathbb{E}^{\mathbb{P}_0}[\rho_t^X \,|\, \mathcal{F}_t]}. \tag{24}$$

This formula can be used to obtain the $\mathcal{F}_t$-conditional probability distribution function for 
$X$, defined for $y \in \mathbb{R}$ by 

$$F_t^X(y) = \mathbb{P}(X \leq y \,|\, \mathcal{F}_t). \tag{25}$$

In the Bayes formula we set $Z = \mathbb{1}\{X \leq y\}$, and the result is 

$$F_t^X(y) = \frac{\int \mathbb{1}\{x \leq y\} \exp(x\xi_t - \psi_0(x)t)\, dF^X(x)}{\int \exp(x\xi_t - \psi_0(x)t)\, dF^X(x)}, \tag{26}$$

where $F^X(y) = \mathbb{P}(X \leq y)$ is the a priori distribution function. It is useful for some purposes 
to work directly with the conditional probability measure $\pi_t(dx)$ induced on $\mathbb{R}$ defined by 
$dF_t^X(x) = \pi_t(dx)$. In particular, when $X$ is a continuous random variable with a density 
function $p(x)$ one can write $\pi_t(dx) = p_t(x)dx$, where $p_t(x)$ is the conditional density function. 

Proposition 4 Let $\{\xi_t\}$ be a Levy information process under $\mathbb{P}$ with noise type $\psi_0(\alpha)$, and 
let the a priori distribution of the associated message $X$ be $\pi(dx)$. Then the $\mathcal{F}_t$-conditional 
a posteriori distribution of $X$ is 

$$\pi_t(dx) = \frac{\exp(x\xi_t - \psi_0(x)t)}{\int \exp(x\xi_t - \psi_0(x)t)\,\pi(dx)}\,\pi(dx). \tag{27}$$
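
For a message with finitely many values, the a posteriori distribution of Proposition 4 reduces to a finite reweighting of the prior by the factor exp(x ξ_t − ψ_0(x) t), followed by normalisation. The sketch below is a minimal illustration under that assumption. The function names, the two-point prior, and the observed values are hypothetical, and the weights are computed in log space purely for numerical stability.

```python
import math

def posterior(prior, xi_t, t, psi0):
    """A posteriori law of X for a discrete prior {x: pi(x)}, weights exp(x*xi_t - psi0(x)*t)."""
    log_w = {x: math.log(p) + x * xi_t - psi0(x) * t for x, p in prior.items() if p > 0}
    shift = max(log_w.values())                  # subtract the maximum before exponentiating
    w = {x: math.exp(v - shift) for x, v in log_w.items()}
    total = sum(w.values())
    return {x: v / total for x, v in w.items()}

def best_estimate(post, f=lambda x: x):
    """Conditional expectation of f(X) under the a posteriori law."""
    return sum(f(x) * p for x, p in post.items())

def psi_brownian(a):
    """Fiducial exponent for Brownian noise."""
    return 0.5 * a * a

def psi_poisson(a, m=2.0):
    """Fiducial exponent for Poisson noise with intensity m."""
    return m * (math.exp(a) - 1.0)

prior = {0.0: 0.5, 1.0: 0.5}                                   # assumed two-point message
post_b = posterior(prior, xi_t=3.0, t=4.0, psi0=psi_brownian)  # assumed Brownian observation
post_p = posterior(prior, xi_t=20, t=4.0, psi0=psi_poisson)    # assumed Poisson count
```

In the Brownian case the posterior odds for X = 1 are exp(ξ_t − t/2), so the values above give probability e/(1 + e). In the Poisson case a count of 20 over four time units strongly favours the higher intensity.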



It is straightforward to establish by use of a variational argument that for any function 
$f : \mathbb{R} \to \mathbb{R}$, such that the random variable $Y = f(X)$ is integrable, the best estimate for $Y$ 
conditional on the information $\mathcal{F}_t$ is given by 

$$\hat{Y}_t := \mathbb{E}^{\mathbb{P}}[Y \,|\, \mathcal{F}_t] = \int f(x)\, \pi_t(dx). \tag{28}$$

By the "best estimate" for $Y$ we mean the $\mathcal{F}_t$-measurable random variable $\hat{Y}_t$ that minimises 
the quadratic error $\mathbb{E}^{\mathbb{P}}[(Y - \hat{Y}_t)^2 \,|\, \mathcal{F}_t]$. 

It will be observed that at any given time $t$ the best estimate can be expressed as a 
function of $\xi_t$ and $t$, and does not involve values of the information process at times earlier 
than $t$. That this should be the case can be seen as a consequence of the following: 

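Proposition 5 The Levy information process $\{\xi_t\}$ has the Markov property. 

Proof. For the Markov property it suffices to establish that for $a \in \mathbb{R}$ we have 

$$\mathbb{P}(\xi_t \leq a \,|\, \mathcal{F}_s) = \mathbb{P}(\xi_t \leq a \,|\, \mathcal{F}^{\xi_s}), \tag{29}$$

where $\mathcal{F}_t = \sigma[\{\xi_s\}_{0\leq s\leq t}]$ and $\mathcal{F}^{\xi_t} = \sigma[\xi_t]$. We write 

$$\Phi_t := \mathbb{E}^{\mathbb{P}_0}[\rho_t^X \,|\, \mathcal{F}_t] = \int \exp(x\xi_t - \psi_0(x)t)\, \pi(dx), \tag{30}$$

where $\rho_t^X$ is defined as in equation (17). It follows that 

$$\begin{aligned}
\mathbb{P}(\xi_t \leq a \,|\, \mathcal{F}_s) = \mathbb{E}^{\mathbb{P}}[\mathbb{1}\{\xi_t \leq a\} \,|\, \mathcal{F}_s] &= \frac{\mathbb{E}^{\mathbb{P}_0}[\Phi_t \mathbb{1}\{\xi_t \leq a\} \,|\, \mathcal{F}_s]}{\Phi_s} \\
&= \frac{\mathbb{E}^{\mathbb{P}_0}[\Phi_t \mathbb{1}\{\xi_t \leq a\} \,|\, \mathcal{F}^{\xi_s}]}{\Phi_s} = \mathbb{P}(\xi_t \leq a \,|\, \mathcal{F}^{\xi_s}),
\end{aligned} \tag{31}$$

since $\{\xi_t\}$ has the Markov property under the transformed measure $\mathbb{P}_0$. □ 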

We note that since $X$ is $\mathcal{F}_\infty$-measurable, which follows from Proposition 1, the Markov 
property implies that if $Y = f(X)$ is integrable we have 

$$\mathbb{E}^{\mathbb{P}}[Y \,|\, \mathcal{F}_t] = \mathbb{E}^{\mathbb{P}}[Y \,|\, \mathcal{F}^{\xi_t}]. \tag{32}$$

This identity allows one to work out the optimal filter for a Levy information process by 
direct use of the Bayes formula. It should be apparent that simulation of the dynamics of 
the filter is readily approachable on account of this property. 

We remark briefly on what might appropriately be called a "time consistency" property 
satisfied by Levy information processes. It follows from (27) that, given the conditional 
distribution $\pi_s(dx)$ at time $s \leq t$, we can express $\pi_t(dx)$ in the form 

$$\pi_t(dx) = \frac{\exp\left(x(\xi_t - \xi_s) - \psi_0(x)(t - s)\right)}{\int \exp\left(x(\xi_t - \xi_s) - \psi_0(x)(t - s)\right)\pi_s(dx)}\,\pi_s(dx). \tag{33}$$

Then if for fixed $s \geq 0$ we introduce a new time variable $u := t - s$, and define $\eta_u = \xi_{u+s} - \xi_s$, 
we find that $\{\eta_u\}_{u\geq 0}$ is an information process with fiducial exponent $\psi_0(\alpha)$ and message $X$ 
with a priori distribution $\pi_s(dx)$. Thus given up-to-date information we can "re-start" the 
information process at that time to produce a new information process of the same type, 
with an adjusted message distribution. 
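
The time consistency property lends itself to a direct numerical check: updating the prior in one step with the data at time t should agree with updating first to time s and then re-starting from the intermediate distribution with the increment of the process. The sketch below assumes Poisson noise and a three-point prior. The helper `update`, the observed counts, and the times are hypothetical.

```python
import math

def update(dist, d_xi, d_t, psi0):
    """Reweight a discrete law by exp(x*d_xi - psi0(x)*d_t) and normalise."""
    w = {x: p * math.exp(x * d_xi - psi0(x) * d_t) for x, p in dist.items()}
    total = sum(w.values())
    return {x: v / total for x, v in w.items()}

m = 2.0

def psi0(a):
    """Fiducial exponent for Poisson noise with intensity m."""
    return m * (math.exp(a) - 1.0)

prior = {-0.5: 0.3, 0.0: 0.4, 0.5: 0.3}    # assumed a priori law of the message
s, t = 1.5, 4.0
xi_s, xi_t = 3, 11                          # hypothetical observed counts at times s and t

direct = update(prior, xi_t, t, psi0)                   # one-step update at time t
pi_s = update(prior, xi_s, s, psi0)                     # update at time s
restarted = update(pi_s, xi_t - xi_s, t - s, psi0)      # re-start from pi_s with the increment
gap = max(abs(direct[x] - restarted[x]) for x in prior)
```

The two routes agree up to rounding, since the exponential weights multiply and the normalising constants cancel.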

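Further insight into the nature of Levy information can be gained by examination of 
expression (19) for the conditional exponent of an information process. In particular, as a 
consequence of the Levy-Khintchine representation (3) we are able to deduce that 

$$\begin{aligned}
\psi_X(\alpha) = {} & \left(p + qX + \int_{\mathbb{R}\setminus\{0\}} z(e^{Xz} - 1)\mathbb{1}\{|z| < 1\}\,\nu(dz)\right)\alpha + \tfrac{1}{2} q\alpha^2 \\
& + \int_{\mathbb{R}\setminus\{0\}} \left(e^{\alpha z} - 1 - \alpha z \mathbb{1}\{|z| < 1\}\right) e^{Xz}\,\nu(dz)
\end{aligned} \tag{34}$$

for $\alpha \in \mathbb{C}^{\rm I}$, which leads to the following: 

Proposition 6 The randomisation of the $\mathbb{P}_0$-Levy process $\{\xi_t\}$ achieved through the change 
of measure generated by the randomised Esscher martingale $\rho_t^X = \exp(X\xi_t - \psi_0(X)t)$ induces 
two effects on the characteristics of the process: (i) a random shift in the drift term, given 
by 

$$p \to p + qX + \int_{\mathbb{R}\setminus\{0\}} z(e^{Xz} - 1)\mathbb{1}\{|z| < 1\}\,\nu(dz), \tag{35}$$

and (ii) a random rescaling of the Levy measure, given by $\nu(dz) \to e^{Xz}\nu(dz)$. 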

The integral appearing in the shift in the drift term is well defined since the term $z(e^{Xz} - 1)$ 
vanishes to second order at the origin. It follows from Proposition 6 that in sampling an 
information process an agent is in effect trying to detect a random shift in the drift term, 
and a random "tilt" and change of scale in the Levy measure, altering the overall rate as 
well as the relative rates at which jumps of various sizes occur. It is from these data, within 
which the message is encoded, that the agent attempts to estimate the value of X. It is 
interesting to note that randomised Esscher martingales arise in the construction of pricing 
kernels in the theory of finance (see, e.g., Shefrin 2008, Macrina & Parbhoo 2011). 

We turn to examine the properties of certain martingales associated with Levy informa- 
tion. We establish the existence of a so-called innovations representation for Levy infor- 
mation. In the case of the Brownian filter the ideas involved are rather well understood 
(see, e.g., Liptser & Shiryaev 2000), and the matter has also been investigated in the case 
of Poisson information (Segall & Kailath 1975). These examples arise as special cases in 
the general theory of Levy information. Throughout the discussion that follows we fix a 
probability space $(\Omega, \mathcal{F}, \mathbb{P})$. 




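Proposition 7 Let $\{\xi_t\}$ be a Levy information process with fiducial exponent $\psi_0(\alpha)$ and 
message $X$, let $\{\mathcal{F}_t\}$ denote the filtration generated by $\{\xi_t\}$, let $Y = \psi_0'(X)$, where $\psi_0'(\alpha)$ is 
the marginal fiducial exponent, and set $\hat{Y}_t = \mathbb{E}^{\mathbb{P}}[Y \,|\, \mathcal{F}_t]$. Then the process $\{M_t\}$ defined by 

$$\xi_t = \int_0^t \hat{Y}_u\, du + M_t \tag{36}$$

is an $(\{\mathcal{F}_t\}, \mathbb{P})$-martingale. 

Proof. We recall that $\{\xi_t\}$ is by definition $\mathcal{F}^X$-conditionally $\mathbb{P}$-Levy. It follows therefore 
from (11) that $\mathbb{E}^{\mathbb{P}}[\xi_t \,|\, X] = Yt$, where $Y = \psi_0'(X)$. As before we let $\{\mathcal{G}_t\}$ denote the filtration 
generated jointly by $\{\xi_t\}$ and $X$. First we observe that the process defined for $t \geq 0$ by 
$m_t = \xi_t - Yt$ is a $(\{\mathcal{G}_t\}, \mathbb{P})$-martingale. This assertion can be checked by consideration of 
the one-parameter family of $(\{\mathcal{G}_t\}, \mathbb{P}_0)$-martingales defined by 

$$\rho_t^{X+\epsilon} = \exp\left((X + \epsilon)\xi_t - \psi_0(X + \epsilon)t\right) \tag{37}$$

for $\epsilon \in \mathbb{C}^{\rm I}$. Expanding this expression to first order in $\epsilon$, we deduce that the process defined 
for $t \geq 0$ by $\rho_t^X(\xi_t - \psi_0'(X)t)$ is a $(\{\mathcal{G}_t\}, \mathbb{P}_0)$-martingale. Thus we have 

$$\mathbb{E}^{\mathbb{P}_0}\left[\rho_t^X(\xi_t - \psi_0'(X)t) \,|\, \mathcal{G}_s\right] = \rho_s^X(\xi_s - \psi_0'(X)s). \tag{38}$$

Then using $\{\rho_t^X\}$ to make a change of measure from $\mathbb{P}_0$ to $\mathbb{P}$ we obtain 

$$\mathbb{E}^{\mathbb{P}}\left[\xi_t - \psi_0'(X)t \,|\, \mathcal{G}_s\right] = \xi_s - \psi_0'(X)s, \tag{39}$$

and the result follows if we set $Y = \psi_0'(X)$. Next we introduce the "projected" process $\{\hat{m}_t\}$ 
defined by $\hat{m}_t = \mathbb{E}^{\mathbb{P}}[m_t \,|\, \mathcal{F}_t]$. We note that since $\{m_t\}$ is a $(\{\mathcal{G}_t\}, \mathbb{P})$-martingale we have 

$$\begin{aligned}
\mathbb{E}^{\mathbb{P}}[\hat{m}_t \,|\, \mathcal{F}_s] &= \mathbb{E}^{\mathbb{P}}[\xi_t - Yt \,|\, \mathcal{F}_s] \\
&= \mathbb{E}^{\mathbb{P}}\left[\mathbb{E}^{\mathbb{P}}[\xi_t - Yt \,|\, \mathcal{G}_s] \,|\, \mathcal{F}_s\right] \\
&= \mathbb{E}^{\mathbb{P}}[\xi_s - Ys \,|\, \mathcal{F}_s] \\
&= \hat{m}_s,
\end{aligned} \tag{40}$$

and thus $\{\hat{m}_t\}$ is an $(\{\mathcal{F}_t\}, \mathbb{P})$-martingale. Finally we observe that 

$$\mathbb{E}^{\mathbb{P}}[M_t \,|\, \mathcal{F}_s] = \mathbb{E}^{\mathbb{P}}\left[\xi_t - \int_0^t \hat{Y}_u\, du \,\Big|\, \mathcal{F}_s\right] = \mathbb{E}^{\mathbb{P}}[\xi_t \,|\, \mathcal{F}_s] - \mathbb{E}^{\mathbb{P}}\left[\int_s^t \hat{Y}_u\, du \,\Big|\, \mathcal{F}_s\right] - \int_0^s \hat{Y}_u\, du, \tag{41}$$

where we have made use of the fact that the final term is $\mathcal{F}_s$-measurable. The fact that 
$\{\hat{m}_t\}$ and $\{\hat{Y}_t\}$ are both $(\{\mathcal{F}_t\}, \mathbb{P})$-martingales implies that 

$$\mathbb{E}^{\mathbb{P}}[\xi_t \,|\, \mathcal{F}_s] - \xi_s = (t - s)\hat{Y}_s = \mathbb{E}^{\mathbb{P}}\left[\int_s^t \hat{Y}_u\, du \,\Big|\, \mathcal{F}_s\right], \tag{42}$$

from which it follows that $\mathbb{E}^{\mathbb{P}}[M_t \,|\, \mathcal{F}_s] = M_s$, which is what we set out to prove. □ 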

Although the general information process does not admit an additive decomposition into 
signal and noise, it does admit a linear decomposition into terms representing (i) information 
already received and (ii) new information. The random variable Y entering via its conditional 
expectation into the first of these terms is itself in general a nonlinear function of the message 
variable X. It follows on account of the convexity of the fiducial exponent that the marginal 
fiducial exponent is invertible, which ensures that X can be expressed in terms of Y by the 
relation $X = I_0(Y)$, which is linear if and only if the information process is Brownian. Thus
signal and noise are deeply intertwined in the case of general Levy information. Vestiges of 
linearity remain, and these suffice to provide an overall element of tractability. 
IV. EXAMPLES OF LEVY INFORMATION PROCESSES

In a number of situations one can construct explicit examples of information processes, 
categorised by noise type. The Brownian and Poisson constructions, which are familiar in 
other contexts, can be seen as belonging to a unified scheme that brings out their differences 
and similarities. We then proceed to construct information processes of the gamma, the 
variance gamma, the negative binomial, the inverse Gaussian, and the normal inverse 
Gaussian type. It is interesting to take note of the diverse nature of noise, and to observe 
the many different ways in which messages can be conveyed in a noisy environment. 

Example 1: Brownian information. On a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, let $\{B_t\}$ be a
Brownian motion, let $X$ be an independent random variable, and set

$$\xi_t = Xt + B_t. \tag{43}$$

The random process $\{\xi_t\}$ thereby defined, which we call the Brownian information process, is
$\mathcal{F}^X$-conditionally $K_X$-Levy, with conditional characteristic $K_X = (X, 1, 0)$ and conditional
exponent $\psi_X(\alpha) = X\alpha + \tfrac{1}{2}\alpha^2$. The fiducial characteristic is $K_0 = (0, 1, 0)$, the fiducial
exponent is $\psi_0(\alpha) = \tfrac{1}{2}\alpha^2$, and the associated fiducial process or "noise type" is standard
Brownian motion. In the case of Brownian information, there is a linear separation of the
process into signal and noise. This model, considered by Wonham (1965), is perhaps the
simplest continuous-time generalisation of the example described by Wiener (1948). The
message is given by the value of $X$, but $X$ can only be observed indirectly, through $\{\xi_t\}$.
The observations of $X$ are obscured by the noise represented by the Brownian motion $\{B_t\}$.
Since the signal term grows linearly in time, whereas $|B_t| \sim \sqrt{t}$, it is intuitively plausible that
observations of $\{\xi_t\}$ will asymptotically reveal the value of $X$, and a direct calculation using
properties of the normal distribution function confirms that $t^{-1}\xi_t$ converges in probability
to $X$; this is consistent with Proposition 1 if we note that $\psi_0'(\alpha) = \alpha$ and $I_0(y) = y$ in the
Brownian case.

The best estimate for $X$ conditional on $\mathcal{F}_t$ is given by (28), which can be derived by use
of the generalised Bayes formula (24). In the Brownian case there is an elementary method
leading to the same result, worth mentioning briefly since it is of interest. First we present
an alternative proof of Proposition 5 in the Brownian case that uses a Brownian bridge
argument.

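We recall that if $s > s_1 > 0$ then $B_s$ and $s^{-1}B_s - s_1^{-1}B_{s_1}$ are independent. More
generally, we observe that if $s > s_1 > s_2$, then $B_s$, $s^{-1}B_s - s_1^{-1}B_{s_1}$, and $s_1^{-1}B_{s_1} - s_2^{-1}B_{s_2}$ are
independent, and that $s^{-1}\xi_s - s_1^{-1}\xi_{s_1} = s^{-1}B_s - s_1^{-1}B_{s_1}$. Extending this line of reasoning,
we see that for any $a \in \mathbb{R}$ we have

$$\begin{aligned}
\mathbb{P}\left(\xi_t \le a \mid \xi_s, \xi_{s_1}, \ldots, \xi_{s_k}\right)
&= \mathbb{P}\left(\xi_t \le a \,\Big|\, \xi_s, \frac{\xi_s}{s} - \frac{\xi_{s_1}}{s_1}, \ldots, \frac{\xi_{s_{k-1}}}{s_{k-1}} - \frac{\xi_{s_k}}{s_k}\right) \\
&= \mathbb{P}\left(\xi_t \le a \,\Big|\, \xi_s, \frac{B_s}{s} - \frac{B_{s_1}}{s_1}, \ldots, \frac{B_{s_{k-1}}}{s_{k-1}} - \frac{B_{s_k}}{s_k}\right) \\
&= \mathbb{P}\left(\xi_t \le a \mid \xi_s\right),
\end{aligned} \tag{44}$$

since $\xi_t$ and $\xi_s$ are independent of $s^{-1}B_s - s_1^{-1}B_{s_1}, \ldots, s_{k-1}^{-1}B_{s_{k-1}} - s_k^{-1}B_{s_k}$, and that gives
us the Markov property (29). Since we have established that $X$ is $\mathcal{F}_\infty$-measurable, it follows
that (32) holds. As a consequence, the a posteriori distribution of $X$ can be worked out by
use of the standard Bayes formula, and for the best estimate of $X$ we obtain

$$\hat{X}_t = \frac{\int x \exp\left(x\xi_t - \tfrac{1}{2}x^2 t\right)\pi(dx)}{\int \exp\left(x\xi_t - \tfrac{1}{2}x^2 t\right)\pi(dx)}. \tag{45}$$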

The innovations representation (36) in the case of a Brownian information process can be
derived by the following argument. We observe that the $(\{\mathcal{F}_t\}, \mathbb{P}_0)$-martingale $\{\Phi_t\}$ defined
in (30) is a "space-time" function of the form

$$\Phi_t := \mathbb{E}^{\mathbb{P}_0}[\rho_t \mid \mathcal{F}_t] = \int \exp\left(x\xi_t - \tfrac{1}{2}x^2 t\right)\pi(dx). \tag{46}$$

By use of the Ito calculus together with (45), we deduce that $d\Phi_t = \hat{X}_t \Phi_t \, d\xi_t$, and thus by
integration we obtain

$$\Phi_t = \exp\left(\int_0^t \hat{X}_s \, d\xi_s - \tfrac{1}{2}\int_0^t \hat{X}_s^2 \, ds\right). \tag{47}$$

Since $\{\xi_t\}$ is an $(\{\mathcal{F}_t\}, \mathbb{P}_0)$-Brownian motion, it follows from (47) by the Girsanov theorem
that the process $\{M_t\}$ defined by

$$\xi_t = \int_0^t \hat{X}_s \, ds + M_t \tag{48}$$

is an $(\{\mathcal{F}_t\}, \mathbb{P})$-Brownian motion, which we call the innovations process (see, e.g., Heunis
2011). The increments of $\{M_t\}$ represent the arrival of new information.
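The Brownian filter can be illustrated numerically. The following is a minimal sketch, assuming the message $X$ has a discrete prior on finitely many points (an assumption made here purely for illustration); the function names `brownian_posterior` and `best_estimate` are hypothetical. It evaluates the posterior weights, which are proportional to $\exp(x\xi_t - \tfrac{1}{2}x^2 t)$ times the prior, together with the resulting conditional mean.

```python
import math


def brownian_posterior(xi_t, t, support, prior):
    """Posterior weights of X given xi_t for a discrete prior.

    Each weight is proportional to exp(x * xi_t - x^2 * t / 2) * prior(x).
    """
    log_w = [math.log(p) + x * xi_t - 0.5 * x * x * t
             for x, p in zip(support, prior)]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def best_estimate(xi_t, t, support, prior):
    """Conditional mean of X given the current value xi_t."""
    post = brownian_posterior(xi_t, t, support, prior)
    return sum(x * p for x, p in zip(support, post))


if __name__ == "__main__":
    # Symmetric two-point prior on {-1, +1}.
    print(best_estimate(0.7, 2.0, [-1.0, 1.0], [0.5, 0.5]))
```

For the symmetric two-point prior on $\{-1, +1\}$ the factor $\exp(-\tfrac{1}{2}t)$ cancels, so the estimate reduces to $\tanh(\xi_t)$, which gives a convenient sanity check.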

We conclude our discussion of Brownian information with the following remarks. In 
problems involving prediction and valuation, it is not uncommon that the message is revealed 
after the passage of a finite amount of time. This is often the case in applications to finance, 
where the message takes the form of a random cash flow at some future date, or, more 
generally, a random factor that affects such a cash flow. There are also numerous examples 
coming from the physical sciences, economics and operations research where the goal of an 
agent is to form a view concerning the outcome of a future event by monitoring the flow 
of information relating to it. How does one handle problems involving the revelation of 
information over finite time horizons? 

One way of modelling finite time horizon scenarios in the present context is by use of a 
time change. If $\{\xi_t\}$ is a Levy information process with message $X$ and a specified fiducial
exponent, then a generalisation of Proposition 1 shows that the process $\{\xi_{tT}\}$ defined over
the time interval $0 \le t < T$ by

$$\xi_{tT} = \frac{T-t}{T}\,\xi\!\left(\frac{tT}{T-t}\right) \tag{49}$$

reveals the value of $X$ in the limit as $t \to T$, and one can check that

$$\mathrm{Cov}\left[\xi_{sT}, \xi_{tT} \mid \mathcal{F}^X\right] = \frac{s(T-t)}{T}\,\psi_0''(X), \qquad (0 \le s \le t < T). \tag{50}$$

In the case where $\{\xi_t\}$ is a Brownian information process represented as above in the form
$\xi_t = Xt + B_t$, the time-changed process (49) takes the form $\xi_{tT} = Xt + \beta_{tT}$, where $\{\beta_{tT}\}$ is
a Brownian bridge over the interval $[0, T]$. Such processes have had applications in physics
(Brody & Hughston 2005, 2006; see also Adler et al. 2001, Brody & Hughston 2002) and 
in finance (Brody et al. 2007, 2008a, Rutkowski & Yu 2007, Brody et al. 2009, Filipovic et 
al. 2012). It seems reasonable to conjecture that time-changed Levy information processes 
of the more general type proposed above may be similarly applicable. 
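The conditional covariance of the time-changed process can be checked in a few lines. The sketch below assumes that the time change takes the form $\xi_{tT} = \frac{T-t}{T}\,\xi(tT/(T-t))$ and uses only the fact that, conditionally on $X$, the process $\{\xi_t\}$ is Levy with variance $\psi_0''(X)$ per unit time; the auxiliary symbol $u(t)$ is introduced here for convenience and is not part of the original notation.

```latex
\begin{align*}
u(t) &:= \frac{tT}{T-t}, \qquad \xi_{tT} = \frac{T-t}{T}\,\xi_{u(t)}, \\
\mathrm{Cov}\left[\xi_{sT}, \xi_{tT} \mid \mathcal{F}^X\right]
  &= \frac{(T-s)(T-t)}{T^2}\,
     \mathrm{Cov}\left[\xi_{u(s)}, \xi_{u(t)} \mid \mathcal{F}^X\right] \\
  &= \frac{(T-s)(T-t)}{T^2}\, u(s)\, \psi_0''(X)
     \qquad \text{(since } u(s) \le u(t) \text{ for } s \le t\text{)} \\
  &= \frac{s\,(T-t)}{T}\, \psi_0''(X).
\end{align*}
```

In the Brownian case $\psi_0''(X) = 1$, and the expression reduces to the covariance $s(T-t)/T$ of a standard Brownian bridge on $[0, T]$.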

Example 2: Poisson information. Consider a situation in which an agent observes a 
series of events taking place at a random rate, and the agent wishes to determine the rate as 
best as possible since its value conveys an important piece of information. One can model the 
information flow in this situation by a modulated Poisson process for which the jump rate is 
an independent random variable. Such a scenario arises in many real-world situations, and 
has been investigated in the literature (Segall & Kailath 1975, Segall et al. 1975, Bremaud 
1981, Di Masi & Runggaldier 1983, Kailath & Poor 1998). The Segall-Kailath scheme can 
be seen to emerge naturally as an example of our general model for Levy information. 

As in the Brownian case, one can construct the relevant information process directly. On 
a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, let $\{N(t)\}_{t \ge 0}$ be a standard Poisson process with jump rate
$m > 0$, let $X$ be an independent random variable, and set

$$\xi_t = N(e^X t). \tag{51}$$

Thus $\{\xi_t\}$ is a time-changed Poisson process, and the effect of the signal is to randomly
modulate the rate at which the process jumps. It is evident that $\{\xi_t\}$ is $\mathcal{F}^X$-conditionally
Levy and satisfies the conditions of Definition 1. In particular,

$$\mathbb{E}^{\mathbb{P}}\left[\exp\left(\alpha N(e^X t)\right) \mid \mathcal{F}^X\right] = \exp\left(m e^X (e^\alpha - 1)t\right), \tag{52}$$

and for fixed $X$ one obtains a Poisson process with rate $me^X$. It follows that (51) is an
information process. The fiducial characteristic is given by $K_0 = (0, 0, m\delta_1(dz))$, that of a
Poisson process with unit jumps at the rate $m$, where $\delta_1(dz)$ is the Dirac measure with unit
mass at $z = 1$, and the fiducial exponent is $\psi_0(\alpha) = m(e^\alpha - 1)$. A calculation
shows that $K_X = (0, 0, me^X\delta_1(dz))$, and that $\psi_X(\alpha) = me^X(e^\alpha - 1)$. The relation between
signal and noise in the case of Poisson information is rather subtle. The noise is associated
with the random fluctuations of the inter-arrival times of the jumps, whereas the message
determines the average rate at which the jumps occur.

It will be instructive in this example to work out the conditional distribution of $X$ by
elementary methods. Since $X$ is $\mathcal{F}_\infty$-measurable and $\{\xi_t\}$ has the Markov property, we have

$$F_t^X(y) := \mathbb{P}(X \le y \mid \mathcal{F}_t) = \mathbb{P}(X \le y \mid \xi_t) \tag{53}$$

for $y \in \mathbb{R}$. It follows then from the Bayes law for an information process taking values in
$\mathbb{N}_0$ that

$$\mathbb{P}(X \le y \mid \xi_t = n) = \frac{\int \mathbb{1}\{x \le y\}\,\mathbb{P}(\xi_t = n \mid X = x)\,dF^X(x)}{\int \mathbb{P}(\xi_t = n \mid X = x)\,dF^X(x)}. \tag{54}$$

In the case of Poisson information the relevant conditional distribution is

$$\mathbb{P}(\xi_t = n \mid X = x) = \exp(-mte^x)\,\frac{(mte^x)^n}{n!}. \tag{55}$$

After some cancellation we deduce that

$$\mathbb{P}(X \le y \mid \xi_t = n) = \frac{\int \mathbb{1}\{x \le y\}\exp\left(xn - m(e^x - 1)t\right)dF^X(x)}{\int \exp\left(xn - m(e^x - 1)t\right)dF^X(x)}, \tag{56}$$

and hence

$$F_t^X(y) = \frac{\int \mathbb{1}\{x \le y\}\exp\left(x\xi_t - m(e^x - 1)t\right)dF^X(x)}{\int \exp\left(x\xi_t - m(e^x - 1)t\right)dF^X(x)}, \tag{57}$$

and thus

$$\pi_t(dx) = \frac{\exp\left(x\xi_t - m(e^x - 1)t\right)}{\int \exp\left(x\xi_t - m(e^x - 1)t\right)\pi(dx)}\,\pi(dx), \tag{58}$$

which we can see is consistent with (27) if we recall that in the case of noise of the Poisson
type the fiducial exponent is given by $\psi_0(\alpha) = m(e^\alpha - 1)$.

If a Geiger counter is monitored continuously in time, the sound that it produces provides
a nice example of a Poisson information process. The crucial message (proximity to
radioactivity) carried by the noisy sputter of the instrument is represented by the rate at 
which the clicks occur. 
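The cancellation that leads from the Poisson likelihood to the exponential form of the filter can be confirmed numerically. The following is a minimal sketch, assuming a discrete prior for $X$ (an assumption for illustration only); the names `poisson_filter` and `direct_bayes` are hypothetical. The first function uses weights proportional to $\exp(xn - m(e^x - 1)t)$, the second applies the Bayes formula directly with the Poisson probabilities $\exp(-mte^x)(mte^x)^n/n!$.

```python
import math


def _normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]


def poisson_filter(n, t, m, support, prior):
    """Posterior of X given xi_t = n, using the exponential (Esscher) form."""
    log_w = [math.log(p) + x * n - m * (math.exp(x) - 1.0) * t
             for x, p in zip(support, prior)]
    shift = max(log_w)  # stabilise the exponentials
    return _normalise([math.exp(v - shift) for v in log_w])


def direct_bayes(n, t, m, support, prior):
    """Posterior of X given xi_t = n, using the Poisson likelihood directly."""
    weights = []
    for x, p in zip(support, prior):
        mean = m * t * math.exp(x)  # conditional mean of xi_t given X = x
        weights.append(p * math.exp(-mean) * mean ** n / math.factorial(n))
    return _normalise(weights)


if __name__ == "__main__":
    support, prior = [-0.5, 0.0, 0.5], [0.3, 0.4, 0.3]
    print(poisson_filter(4, 2.0, 1.5, support, prior))
    print(direct_bayes(4, 2.0, 1.5, support, prior))
```

The two posteriors agree because the factors $(mt)^n/n!$ and $e^{-mt}$ do not depend on $x$ and drop out on normalisation.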

Example 3: Gamma information. It will be convenient first to recall a few definitions
and conventions (cf. Yor 2007, Brody et al. 2008b, Brody et al. 2012). Let $m$ and $\kappa$ be
positive numbers. By a gamma process with rate $m$ and scale $\kappa$ on a probability space
$(\Omega, \mathcal{F}, \mathbb{P})$ we mean a Levy process $\{\gamma_t\}_{t \ge 0}$ with exponent

$$t^{-1}\ln \mathbb{E}^{\mathbb{P}}[\exp(\alpha\gamma_t)] = -m\ln(1 - \kappa\alpha) \tag{59}$$

for $\alpha \in A_{\mathbb{C}} = \{w \in \mathbb{C} \mid \operatorname{Re} w < \kappa^{-1}\}$. The probability density for $\gamma_t$ is

$$\mathbb{P}(\gamma_t \in dx) = \mathbb{1}\{x > 0\}\,\frac{\kappa^{-mt}x^{mt-1}\exp(-x/\kappa)}{\Gamma[mt]}\,dx, \tag{60}$$

where $\Gamma[a]$ is the gamma function. A short calculation making use of the functional equation
$\Gamma[a+1] = a\Gamma[a]$ shows that $\mathbb{E}^{\mathbb{P}}[\gamma_t] = m\kappa t$ and $\mathrm{Var}^{\mathbb{P}}[\gamma_t] = m\kappa^2 t$. Clearly, the mean and
variance determine the rate and scale. If $\kappa = 1$ we say that $\{\gamma_t\}$ is a standard gamma
process with rate $m$. If $\kappa \ne 1$ we say that $\{\gamma_t\}$ is a scaled gamma process. The Levy
measure associated with the gamma process is

$$\nu(dz) = \mathbb{1}\{z > 0\}\,m\,z^{-1}\exp(-\kappa^{-1}z)\,dz. \tag{61}$$

It follows that $\nu(\mathbb{R}\setminus\{0\}) = \infty$ and hence that the gamma process has infinite activity. Now
let $\{\gamma_t\}$ be a standard gamma process with rate $m$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P}_0)$, and
let $\lambda \in \mathbb{R}$ satisfy $\lambda < 1$. Then the process $\{\rho_t^\lambda\}$ defined by

$$\rho_t^\lambda = (1-\lambda)^{mt}\,e^{\lambda\gamma_t} \tag{62}$$

is an $(\{\mathcal{F}_t\}, \mathbb{P}_0)$-martingale. If we let $\{\rho_t^\lambda\}$ act as a change of measure density for the
transformation $\mathbb{P}_0 \to \mathbb{P}_\lambda$, then we find that $\{\gamma_t\}$ is a scaled gamma process under $\mathbb{P}_\lambda$, with
rate $m$ and scale $1/(1-\lambda)$. Thus we see that the effect of an Esscher transformation on a
gamma process is to alter its scale. With these facts in mind, one can establish the following:
Proposition 8 Let $\{\gamma_t\}$ be a standard gamma process with rate $m$ on a probability space
$(\Omega, \mathcal{F}, \mathbb{P})$, and let the independent random variable $X$ satisfy $X < 1$ almost surely. Then
the process $\{\xi_t\}$ defined by

$$\xi_t = \frac{1}{1-X}\,\gamma_t \tag{63}$$

is a Levy information process with message $X$ and gamma noise, with fiducial exponent
$\psi_0(\alpha) = -m\ln(1-\alpha)$ for $\alpha \in \{w \in \mathbb{C} \mid \operatorname{Re} w < 1\}$.

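Proof. It is evident that $\{\xi_t\}$ is $\mathcal{F}^X$-conditionally a scaled gamma process. As a consequence
of (59) we have

$$\frac{1}{t}\ln \mathbb{E}^{\mathbb{P}}[\exp(\alpha\xi_t) \mid X] = \frac{1}{t}\ln \mathbb{E}^{\mathbb{P}}\left[\exp\left(\frac{\alpha\gamma_t}{1-X}\right) \Big|\, X\right] = -m\ln\left(1 - \frac{\alpha}{1-X}\right) \tag{64}$$

for $\alpha \in \mathbb{C}^{\mathrm{I}}$. Then we note that

$$-m\ln\left(1 - \frac{\alpha}{1-X}\right) = -m\ln\left(1 - (X+\alpha)\right) + m\ln(1-X). \tag{65}$$

It follows that the $\mathcal{F}^X$-conditional $\mathbb{P}$ exponent of $\{\xi_t\}$ is $\psi_0(X+\alpha) - \psi_0(X)$. □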

The gamma filter arises as follows. An agent observes a process of accumulation. Typically
there are many small increments, but now and then there are large increments. The
rate at which the process is growing is the figure that the agent wishes to estimate as
accurately as possible. The accumulation can be modelled by gamma information, and the
associated filter can be used to estimate the growth rate. It has long been recognised that
the gamma process is useful in describing phenomena such as the water level of a dam or the
totality of the claims made in a large portfolio of insurance contracts (Gani 1957, Kendall
1957, Gani & Pyke 1960). Use of the gamma information process and related bridge processes,
with applications in finance and insurance, is pursued in Brody et al. (2008b), Hoyle
(2010), and Hoyle et al. (2011). We draw the reader's attention to Yor (2007) and references
cited therein, where it is shown how certain additive properties of Brownian motion have
multiplicative analogues in the case of the gamma process. One notes in particular the
remarkable property that $\gamma_t$ and $\gamma_s/\gamma_t$ are independent for $t \ge s \ge 0$. Making use of this
relation, it will be instructive to present an alternative derivation of the optimal filter for
gamma noise. We begin by establishing that the process defined by (63) has the Markov
property. We observe first that for any times $t \ge s \ge s_1 \ge s_2 \ge \cdots \ge s_k$ the variables
$\gamma_{s_1}/\gamma_s$, $\gamma_{s_2}/\gamma_{s_1}$, and so on, are independent of one another and are independent of $\gamma_s$ and $\gamma_t$.
It follows that

$$\begin{aligned}
\mathbb{P}\left(\xi_t \le a \mid \xi_s, \xi_{s_1}, \ldots, \xi_{s_k}\right)
&= \mathbb{P}\left(\xi_t \le a \mid (1-X)^{-1}\gamma_s, \ldots, (1-X)^{-1}\gamma_{s_k}\right) \\
&= \mathbb{P}\left(\xi_t \le a \,\Big|\, \xi_s, \frac{\gamma_{s_1}}{\gamma_s}, \frac{\gamma_{s_2}}{\gamma_{s_1}}, \ldots, \frac{\gamma_{s_k}}{\gamma_{s_{k-1}}}\right) \\
&= \mathbb{P}\left(\xi_t \le a \mid \xi_s\right),
\end{aligned} \tag{66}$$

since $\{\gamma_t\}$ and $X$ are independent, and that gives us (29). In working out the distribution
of $X$ given $\mathcal{F}_t$ it suffices therefore to work out the distribution of $X$ given $\xi_t$. We note that
the Bayes formula implies that

$$\pi_t(dx) = \frac{p(\xi_t \mid X = x)\,\pi(dx)}{\int p(\xi_t \mid X = x)\,\pi(dx)}, \tag{67}$$
where $\pi(dx)$ is the unconditional distribution of $X$, and $p(\xi \mid X = x)$ is the conditional density
for the random variable $\xi_t$, which can be calculated as follows:

$$\begin{aligned}
p(\xi \mid X = x) &= \frac{d}{d\xi}\,\mathbb{P}(\xi_t \le \xi \mid X = x) = \frac{d}{d\xi}\,\mathbb{P}\left((1-X)^{-1}\gamma_t \le \xi \mid X = x\right) \\
&= \frac{d}{d\xi}\,\mathbb{P}\left(\gamma_t \le (1-x)\xi\right) = \frac{\xi^{mt-1}(1-x)^{mt}\,e^{-(1-x)\xi}}{\Gamma[mt]}.
\end{aligned} \tag{68}$$

It follows that the optimal filter in the case of gamma noise is given by

$$\pi_t(dx) = \frac{(1-x)^{mt}\exp(x\xi_t)}{\int_{-\infty}^{1}(1-x)^{mt}\exp(x\xi_t)\,\pi(dx)}\,\pi(dx). \tag{69}$$

We conclude with the following observation. In the case of Brownian information, it is well 
known (and implicit in the example of Wiener 1948) that if the signal is Gaussian, then the 
optimal filter is a linear function of the observation $\xi_t$. One might therefore ask in the case
of a gamma information process if some special choice of the signal distribution gives rise
to a linear filter. The answer is affirmative. Let $U$ be a gamma-distributed random variable
with the distribution

$$\mathbb{P}(U \in du) = \mathbb{1}\{u > 0\}\,\frac{\theta^r u^{r-1}\exp(-\theta u)}{\Gamma[r]}\,du, \tag{70}$$

where $r > 1$ and $\theta > 0$ are parameters, and set $X = 1 - U$. Let $\{\xi_t\}$ be a gamma information
process carrying message $X$, let $Y = \psi_0'(X) = m/(1-X)$, and set $\tau = (r-1)/m$. Then the
optimal filter for $Y$ is given by

$$\hat{Y}_t := \mathbb{E}^{\mathbb{P}}[Y \mid \mathcal{F}_t] = \frac{\xi_t + \theta}{t + \tau}. \tag{71}$$
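The linearity of the gamma filter for a gamma-distributed $U = 1 - X$ can be checked by brute-force integration. The sketch below is an illustration under stated assumptions: it takes the prior density of $U$ to be proportional to $u^{r-1}e^{-\theta u}$, uses filter weights proportional to $(1-x)^{mt}\exp(x\xi_t)$, and integrates with a simple midpoint rule; the function names, grid size, and cut-off `u_max` are hypothetical choices. The numerical conditional mean of $Y = m/(1-X)$ is compared with the closed form $(\xi_t + \theta)/(t + \tau)$, where $\tau = (r-1)/m$.

```python
import math


def gamma_filter_mean_y(xi_t, t, m, r, theta, n_steps=200_000, u_max=40.0):
    """Numerical E[Y | xi_t] with Y = m / U, U = 1 - X, by the midpoint rule."""
    h = u_max / n_steps
    num = 0.0
    den = 0.0
    for i in range(n_steps):
        u = (i + 0.5) * h
        # Prior factor u^(r-1) exp(-theta u) times filter factor
        # u^(m t) exp((1 - u) xi_t); the constant exp(xi_t) is dropped
        # because it cancels in the ratio.
        log_w = (r - 1.0 + m * t) * math.log(u) - (theta + xi_t) * u
        w = math.exp(log_w)
        num += (m / u) * w
        den += w
    return num / den


def linear_filter(xi_t, t, m, r, theta):
    """Closed-form linear filter (xi_t + theta) / (t + tau), tau = (r - 1) / m."""
    tau = (r - 1.0) / m
    return (xi_t + theta) / (t + tau)


if __name__ == "__main__":
    args = dict(xi_t=4.0, t=1.5, m=2.0, r=3.0, theta=1.0)
    print(gamma_filter_mean_y(**args), linear_filter(**args))
```

The agreement reflects the conjugacy of the gamma prior: given $\xi_t$, the variable $U$ is again gamma distributed, with shape $mt + r$ and rate $\theta + \xi_t$.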



Example 4: Variance-gamma information. The so-called variance-gamma or VG process
(Madan & Seneta 1990, Madan & Milne 1991, Madan et al. 1998) was introduced in the
theory of finance. The relevant definitions and conventions are as follows. By a VG process
with drift $\mu \in \mathbb{R}$, volatility $\sigma \ge 0$, and rate $m > 0$, we mean a Levy process with exponent

$$\psi(\alpha) = -m\ln\left(1 - \frac{\mu}{m}\alpha - \frac{\sigma^2}{2m}\alpha^2\right). \tag{72}$$

The VG process admits representations in terms of simpler Levy processes. Let $\{\gamma_t\}$ be a
standard gamma process on $(\Omega, \mathcal{F}, \mathbb{P})$, with rate $m$, as defined in the previous example, and
let $\{B_t\}$ be a standard Brownian motion, independent of $\{\gamma_t\}$. We call the scaled process
$\{\Gamma_t\}$ defined by $\Gamma_t = m^{-1}\gamma_t$ a gamma subordinator with rate $m$. Note that $\Gamma_t$ has dimensions
of time and that $\mathbb{E}^{\mathbb{P}}[\Gamma_t] = t$. A calculation shows that the Levy process $\{V_t\}$ defined by

$$V_t = \mu\Gamma_t + \sigma B_{\Gamma_t} \tag{73}$$

has the exponent (72). The VG process thus takes the form of a Brownian motion with
drift, time-changed by a gamma subordinator. If $\mu = 0$ and $\sigma = 1$, we say that $\{V_t\}$ is a
"standard" VG process, with rate parameter $m$. If $\mu \ne 0$, we say that $\{V_t\}$ is a "drifted"
VG process. One can always choose units of time such that $m = 1$, but for applications
it is better to choose conventional units of time (seconds for physics, years for economics),
and treat $m$ as a model parameter. In the limit $\sigma \to 0$ we obtain a gamma process with
rate $m$ and scale $\mu/m$. In the limit $m \to \infty$ we obtain a Brownian motion with drift $\mu$ and
volatility $\sigma$.

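An alternative representation of the VG process results if we let $\{\gamma_t^1\}$ and $\{\gamma_t^2\}$ be
independent standard gamma processes on $(\Omega, \mathcal{F}, \mathbb{P})$, with rate $m$, and set

$$V_t = \kappa_1\gamma_t^1 - \kappa_2\gamma_t^2, \tag{74}$$

where $\kappa_1$ and $\kappa_2$ are nonnegative constants. A calculation shows that the exponent is of the
form (72). In particular, we have

$$\psi(\alpha) = -m\ln\left(1 - (\kappa_1 - \kappa_2)\alpha - \kappa_1\kappa_2\alpha^2\right), \tag{75}$$

where $\mu = m(\kappa_1 - \kappa_2)$ and $\sigma^2 = 2m\kappa_1\kappa_2$, or equivalently

$$\kappa_1 = \frac{1}{2m}\left(\mu + \sqrt{\mu^2 + 2m\sigma^2}\right) \quad \text{and} \quad \kappa_2 = \frac{1}{2m}\left(-\mu + \sqrt{\mu^2 + 2m\sigma^2}\right), \tag{76}$$

where $\alpha \in \{w \in \mathbb{C} : -1/\kappa_2 < \operatorname{Re} w < 1/\kappa_1\}$. Now let $\{\xi_t\}$ be a standard VG process on
$(\Omega, \mathcal{F}, \mathbb{P}_0)$, with exponent $\psi_0(\alpha) = -m\ln(1 - (2m)^{-1}\alpha^2)$ for $\alpha \in \{w \in \mathbb{C} : |\operatorname{Re} w| < \sqrt{2m}\}$.
Under the transformed measure $\mathbb{P}_\lambda$ defined by the change-of-measure martingale
$\rho_t^\lambda = \exp(\lambda\xi_t - \psi_0(\lambda)t)$, one finds that $\{\xi_t\}$ is a drifted VG process, with

$$\mu_\lambda = \lambda\left(1 - \frac{1}{2m}\lambda^2\right)^{-1} \quad \text{and} \quad \sigma_\lambda = \left(1 - \frac{1}{2m}\lambda^2\right)^{-1/2} \tag{77}$$

for $|\lambda| < \sqrt{2m}$. Thus in the case of the VG process an Esscher transformation affects both
the drift and the volatility. Note that for large $m$ the effect on the volatility is insignificant,
whereas the effect on the drift reduces to that of an ordinary Girsanov transformation.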

With these facts in hand, we are now in a position to construct the VG information 
process. We fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a number $m > 0$.

Proposition 9 Let $\{\Gamma_t\}$ be a standard gamma subordinator with rate $m$, let $\{B_t\}$ be an
independent Brownian motion, and let the independent random variable $X$ satisfy $|X| <
\sqrt{2m}$ almost surely. Then the process $\{\xi_t\}$ defined by

$$\xi_t = X\left(1 - \frac{1}{2m}X^2\right)^{-1}\Gamma_t + \left(1 - \frac{1}{2m}X^2\right)^{-1/2}B(\Gamma_t) \tag{78}$$

is a Levy information process with message $X$ and VG noise, with fiducial exponent

$$\psi_0(\alpha) = -m\ln\left(1 - \frac{1}{2m}\alpha^2\right) \tag{79}$$

for $\alpha \in \{w \in \mathbb{C} : |\operatorname{Re} w| < \sqrt{2m}\}$.
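Proof. Observe that $\{\xi_t\}$ is $\mathcal{F}^X$-conditionally a drifted VG process of the form

$$\xi_t = \mu_X\Gamma_t + \sigma_X B(\Gamma_t), \tag{80}$$

where the drift and volatility coefficients are

$$\mu_X = X\left(1 - \frac{1}{2m}X^2\right)^{-1} \quad \text{and} \quad \sigma_X = \left(1 - \frac{1}{2m}X^2\right)^{-1/2}. \tag{81}$$

The $\mathcal{F}^X$-conditional $\mathbb{P}$-exponent of $\{\xi_t\}$ is by (72) thus given for $\alpha \in \mathbb{C}^{\mathrm{I}}$ by

$$\begin{aligned}
\psi_X(\alpha) &= -m\ln\left(1 - \frac{1}{m}\mu_X\alpha - \frac{1}{2m}\sigma_X^2\alpha^2\right) \\
&= -m\ln\left(1 - \frac{1}{m}X\left(1 - \frac{1}{2m}X^2\right)^{-1}\alpha - \frac{1}{2m}\left(1 - \frac{1}{2m}X^2\right)^{-1}\alpha^2\right) \\
&= -m\ln\left(1 - \frac{1}{2m}(X+\alpha)^2\right) + m\ln\left(1 - \frac{1}{2m}X^2\right),
\end{aligned} \tag{82}$$

which is evidently by (79) of the form $\psi_0(X+\alpha) - \psi_0(X)$, as required. □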

An alternative representation for the VG information process can be established by the 
same method if one randomly rescales the gamma subordinator appearing in the time- 
changed Brownian motion. The result is as follows. 

Proposition 10 Let $\{\Gamma_t\}$ be a gamma subordinator with rate $m$, let $\{B_t\}$ be an independent
standard Brownian motion, and let the independent random variable $X$ satisfy $|X| < \sqrt{2m}$
almost surely. Write $\{\Gamma_t^X\}$ for the subordinator:

$$\Gamma_t^X = \left(1 - \frac{1}{2m}X^2\right)^{-1}\Gamma_t. \tag{83}$$

Then the process $\{\xi_t\}$ defined by $\xi_t = X\Gamma_t^X + B(\Gamma_t^X)$ is a VG information process with
message $X$.

A further representation of the VG information process arises as a consequence of the 
representation of the VG process as the asymmetric difference between two independent 
standard gamma processes. In particular, we have: 

Proposition 11 Let $\{\gamma_t^1\}$ and $\{\gamma_t^2\}$ be independent standard gamma processes, each with
rate $m$, and let the independent random variable $X$ satisfy $|X| < \sqrt{2m}$ almost surely. Then
the process $\{\xi_t\}$ defined by

$$\xi_t = \frac{1}{\sqrt{2m} - X}\,\gamma_t^1 - \frac{1}{\sqrt{2m} + X}\,\gamma_t^2 \tag{84}$$

is a VG information process with message $X$.
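The equivalence of the VG representations can be verified numerically at the level of exponents. The sketch below is an illustration, not part of the original argument: it assumes a real argument $\alpha$ with $|x + \alpha| < \sqrt{2m}$, takes the time-changed Brownian representation to have drift $x(1 - x^2/2m)^{-1}$ and squared volatility $(1 - x^2/2m)^{-1}$, and takes the gamma-difference representation to have coefficients $1/(\sqrt{2m} - x)$ and $1/(\sqrt{2m} + x)$. The function names are hypothetical.

```python
import math


def psi0_vg(alpha, m):
    """Fiducial exponent of the standard VG process (real argument)."""
    return -m * math.log(1.0 - alpha * alpha / (2.0 * m))


def exponent_time_changed(alpha, x, m):
    """Conditional exponent of mu * Gamma_t + sigma * B(Gamma_t) given X = x."""
    scale = 1.0 / (1.0 - x * x / (2.0 * m))
    mu, sigma2 = x * scale, scale
    return -m * math.log(1.0 - mu * alpha / m - sigma2 * alpha * alpha / (2.0 * m))


def exponent_gamma_difference(alpha, x, m):
    """Conditional exponent of k1 * gamma1 - k2 * gamma2 given X = x."""
    k1 = 1.0 / (math.sqrt(2.0 * m) - x)
    k2 = 1.0 / (math.sqrt(2.0 * m) + x)
    return -m * math.log(1.0 - k1 * alpha) - m * math.log(1.0 + k2 * alpha)


if __name__ == "__main__":
    m, x, alpha = 3.0, 0.5, 0.4
    target = psi0_vg(x + alpha, m) - psi0_vg(x, m)
    print(target, exponent_time_changed(alpha, x, m),
          exponent_gamma_difference(alpha, x, m))
```

All three printed values should coincide, since each representation has conditional exponent $\psi_0(x + \alpha) - \psi_0(x)$.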
Example 5: Negative-binomial information. By a negative binomial process with rate
parameter $m$ and probability parameter $q$, where $m > 0$ and $0 < q < 1$, we mean a Levy
process with exponent

$$\psi_0(\alpha) = m\ln\left(\frac{1-q}{1-qe^\alpha}\right) \tag{85}$$

for $\alpha \in \{w \in \mathbb{C} \mid \operatorname{Re} w < -\ln q\}$. There are two representations for the negative binomial
process (Kozubowski & Podgorski 2009; Brody et al. 2012). The first of these is a compound
Poisson process for which the jump size $J \in \mathbb{N}$ has a logarithmic distribution

$$\mathbb{P}_0(J = n) = -\frac{1}{\ln(1-q)}\,\frac{1}{n}\,q^n, \tag{86}$$

and the intensity of the Poisson process determining the timing of the jumps is given by
$\lambda = -m\ln(1-q)$. One finds that the characteristic function of $J$ is

$$\phi_0(\alpha) := \mathbb{E}^{\mathbb{P}_0}[\exp(\alpha J)] = \frac{\ln(1-qe^\alpha)}{\ln(1-q)} \tag{87}$$

for $\alpha \in \{w \in \mathbb{C} \mid \operatorname{Re} w < -\ln q\}$. Then if we set

$$n_t = \sum_{k=1}^{\infty}\mathbb{1}\{k \le N_t\}\,J_k, \tag{88}$$

where $\{N_t\}$ is a Poisson process with rate $\lambda$, and $\{J_k\}_{k \in \mathbb{N}}$ denotes a collection of independent
identical copies of $J$, representing the jumps, one deduces that

$$\mathbb{E}^{\mathbb{P}_0}[\exp(\alpha n_t)] = \exp\left(\lambda t\,(\phi_0(\alpha) - 1)\right) = \left(\frac{1-q}{1-qe^\alpha}\right)^{mt}, \tag{89}$$

and that the resulting exponent is given by (85). The second representation of the negative
binomial process makes use of the method of subordination. We take a Poisson process
with rate $\lambda = mq/(1-q)$, and time-change it using a gamma subordinator $\{\Gamma_t\}$ with rate
parameter $m$. The moment generating function thus obtained, in agreement with (85), is

$$\mathbb{E}^{\mathbb{P}_0}[\exp(\alpha N(\Gamma_t))] = \left(\frac{1-q}{1-qe^\alpha}\right)^{mt}. \tag{90}$$

With these results in mind, we fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and find:

Proposition 12 Let $\{\Gamma_t\}$ be a gamma subordinator with rate $m$, let $\{N_t\}$ be an independent
Poisson process with rate $m$, let the independent random variable $X$ satisfy $X < -\ln q$ almost
surely, and set

$$\Gamma_t^X = \frac{qe^X}{1-qe^X}\,\Gamma_t. \tag{91}$$

Then the process defined by

$$\xi_t = N(\Gamma_t^X) \tag{92}$$

is a Levy information process with message $X$ and negative binomial noise, with fiducial
exponent (85).
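Proof. This can be verified by direct calculation. For $\alpha \in \mathbb{C}^{\mathrm{I}}$ we have:

$$\begin{aligned}
\mathbb{E}^{\mathbb{P}}\left[e^{\alpha\xi_t} \mid X\right] &= \mathbb{E}^{\mathbb{P}}\left[\exp\left(\alpha N(\Gamma_t^X)\right) \mid X\right] = \mathbb{E}^{\mathbb{P}}\left[\exp\left(m\,\frac{qe^X}{1-qe^X}\,(e^\alpha - 1)\,\Gamma_t\right) \Big|\, X\right] \\
&= \left(1 - \frac{qe^X(e^\alpha - 1)}{1-qe^X}\right)^{-mt} = \left(\frac{1-qe^X}{1-qe^{X+\alpha}}\right)^{mt},
\end{aligned} \tag{93}$$

which by (85) shows that the conditional exponent is $\psi_0(X+\alpha) - \psi_0(X)$. □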



There is also a representation for negative binomial information based on the compound 
Poisson process. This can be obtained by an application of Proposition 6, which shows
how the Levy measure transforms under a random Esscher transformation. In the case of a
negative binomial process with parameters $m$ and $q$, the Levy measure is given by

$$\nu(dz) = m\sum_{n=1}^{\infty}\frac{1}{n}\,q^n\,\delta_n(dz), \tag{94}$$

where $\delta_n(dz)$ denotes the Dirac measure with unit mass at the point $z = n$. The Levy
measure is finite in this case, and we have $\nu(\mathbb{N}) = -m\ln(1-q)$, which is the overall rate at
which the compound Poisson process jumps. If one normalises the Levy measure with the
overall jump rate, one obtains the probability measure (86) for the jump size. With these
facts in mind, we fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and specify the constants $m$ and $q$, where
$m > 1$ and $0 < q < 1$. Then as a consequence of Proposition 6 we have the following:

Proposition 13 Let the random variable $X$ satisfy $X < -\ln q$ almost surely, let the random
variable $J^X$ have the conditional distribution

$$\mathbb{P}(J^X = n \mid X) = -\frac{1}{\ln(1-qe^X)}\,\frac{1}{n}\,(qe^X)^n, \tag{95}$$

let $\{J_k^X\}_{k \in \mathbb{N}}$ be a collection of conditionally independent identical copies of $J^X$, and let $\{N_t\}$
be an independent Poisson process with rate $m$. Then the process $\{\xi_t\}$ defined by

$$\xi_t = \sum_{k=1}^{\infty}\mathbb{1}\left\{k \le N\left(-\ln(1-qe^X)\,t\right)\right\}J_k^X \tag{96}$$

is a Levy information process with message $X$ and negative binomial noise, with fiducial
exponent (85).
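Both negative binomial representations can be checked against the fiducial exponent numerically. The following sketch is illustrative only: it assumes a real argument $\alpha$ with $x + \alpha < -\ln q$, scales the gamma subordinator by $qe^x/(1 - qe^x)$ in the subordinated representation, and in the compound Poisson representation uses logarithmic jump probabilities $-(qe^x)^n/(n\ln(1 - qe^x))$ with overall jump rate $-m\ln(1 - qe^x)$. The jump series is truncated at a hypothetical `n_max`, and the function names are not from the original text.

```python
import math


def psi0_nb(alpha, m, q):
    """Fiducial exponent m * ln((1 - q) / (1 - q e^alpha)) for real alpha."""
    return m * math.log((1.0 - q) / (1.0 - q * math.exp(alpha)))


def subordinated_exponent(alpha, x, m, q):
    """Exponent of a rate-m Poisson process run on the rescaled subordinator."""
    c = q * math.exp(x) / (1.0 - q * math.exp(x))
    return -m * math.log(1.0 - c * (math.exp(alpha) - 1.0))


def jump_mgf(alpha, x, q, n_max=2000):
    """E[exp(alpha * J^x)] for the logarithmic jump law (truncated series)."""
    log_qx = math.log(q) + x
    norm = -math.log(1.0 - math.exp(log_qx))
    # Work with exp(n * (log_qx + alpha)) to avoid overflow for large n.
    series = sum(math.exp(n * (log_qx + alpha)) / n for n in range(1, n_max + 1))
    return series / norm


def compound_exponent(alpha, x, m, q):
    """Exponent of the compound Poisson representation given X = x."""
    rate = -m * math.log(1.0 - q * math.exp(x))
    return rate * (jump_mgf(alpha, x, q) - 1.0)


if __name__ == "__main__":
    m, q, x, alpha = 2.0, 0.3, 0.2, 0.5
    target = psi0_nb(x + alpha, m, q) - psi0_nb(x, m, q)
    print(target, subordinated_exponent(alpha, x, m, q),
          compound_exponent(alpha, x, m, q))
```

Setting `alpha = 0` in `jump_mgf` returns the total mass of the jump distribution, which provides a quick check that the logarithmic law is correctly normalised.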



Example 6: Inverse Gaussian information. The inverse Gaussian (IG) distribution
appears in the study of the first exit time of Brownian motion with drift (Schrodinger
1915). The name "inverse Gaussian" was introduced by Tweedie (1945), and a Levy process
whose increments have the IG distribution was introduced in Wasan (1968). By an IG
process with parameters $a > 0$ and $b > 0$, we mean a Levy process with exponent

$$\psi_0(\alpha) = a\left(b - \sqrt{b^2 - 2\alpha}\right) \tag{97}$$

for $\alpha \in \{w \in \mathbb{C} \mid 0 \le \operatorname{Re} w < \tfrac{1}{2}b^2\}$. Let us write $\{G_t\}$ for the IG process. The probability
density function for $G_t$ is

$$\mathbb{P}_0(G_t \in dx) = \mathbb{1}\{x > 0\}\,\frac{at}{\sqrt{2\pi x^3}}\exp\left(-\frac{(bx - at)^2}{2x}\right)dx, \tag{98}$$

and we find that $\mathbb{E}^{\mathbb{P}_0}[G_t] = at/b$ and that $\mathrm{Var}^{\mathbb{P}_0}[G_t] = at/b^3$. It is straightforward to check
that under the Esscher transformation $\mathbb{P}_0 \to \mathbb{P}_\lambda$ induced by the associated change-of-measure
martingale, where $0 < \lambda < \tfrac{1}{2}b^2$, the parameter $a$ is left unchanged, whereas $b \to (b^2 - 2\lambda)^{1/2}$.
With these facts in mind we are in a position to introduce the associated information process.
We fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and find the following:



Proposition 14 Let $\{G(t)\}$ be an inverse Gaussian process with parameters $a$ and $b$, let $X$
be an independent random variable satisfying $0 < X < \tfrac{1}{2}b^2$ almost surely, and set $Z =
b^{-1}(b^2 - 2X)^{1/2}$. Then the process $\{\xi_t\}$ defined by

$$\xi_t = Z^{-2}\,G(Zt) \tag{99}$$

is a Levy information process with message $X$ and inverse Gaussian noise, with fiducial
exponent (97).

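Proof. It should be evident by inspection that $\{\xi_t\}$ is $\mathcal{F}^X$-conditionally Levy. Let us therefore
work out the conditional exponent. For $\alpha \in \mathbb{C}^{\mathrm{I}}$ we have:

$$\begin{aligned}
\mathbb{E}^{\mathbb{P}}[\exp(\alpha\xi_t) \mid X] &= \mathbb{E}^{\mathbb{P}}\left[\exp\left(\alpha\,\frac{b^2}{b^2 - 2X}\,G\left(b^{-1}\sqrt{b^2 - 2X}\,t\right)\right) \Big|\, X\right] \\
&= \exp\left(at\left(\sqrt{b^2 - 2X} - \sqrt{b^2 - 2(\alpha + X)}\right)\right) \\
&= \exp\left(at\left(b - \sqrt{b^2 - 2(\alpha + X)}\right) - at\left(b - \sqrt{b^2 - 2X}\right)\right),
\end{aligned} \tag{100}$$

which shows that the conditional exponent is of the form $\psi_0(\alpha + X) - \psi_0(X)$. □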
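The scaling construction for inverse Gaussian information can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the IG exponent to be $a(b - \sqrt{b^2 - 2\alpha})$, the scaling variable to be $z = b^{-1}\sqrt{b^2 - 2x}$, and the information process to be $z^{-2}G(zt)$, and it uses a real argument $\alpha$ with $\alpha + x < \tfrac{1}{2}b^2$. The function names are hypothetical.

```python
import math


def psi0_ig(alpha, a, b):
    """IG exponent a * (b - sqrt(b^2 - 2 alpha)) for real alpha < b^2 / 2."""
    return a * (b - math.sqrt(b * b - 2.0 * alpha))


def conditional_exponent_ig(alpha, x, a, b):
    """Exponent of z^{-2} G(z t) given X = x, with z = sqrt(b^2 - 2 x) / b."""
    z = math.sqrt(b * b - 2.0 * x) / b
    # Time runs at rate z, and the argument alpha is rescaled by 1 / z^2.
    return z * psi0_ig(alpha / (z * z), a, b)


if __name__ == "__main__":
    a, b, x, alpha = 1.5, 2.0, 0.6, 0.3
    print(conditional_exponent_ig(alpha, x, a, b),
          psi0_ig(alpha + x, a, b) - psi0_ig(x, a, b))
```

The two printed numbers should agree, which is the statement that the rescaled process has conditional exponent $\psi_0(\alpha + x) - \psi_0(x)$.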



Example 7: Normal inverse Gaussian information. By a normal inverse Gaussian
(NIG) process (Rydberg 1997, Barndorff-Nielsen 1998) with parameters $a$, $b$, and $m$, such
that $a > 0$, $|b| < a$, and $m > 0$, we mean a Levy process with an exponent of the form

$$\psi_0(\alpha) = m\left(\sqrt{a^2 - b^2} - \sqrt{a^2 - (b + \alpha)^2}\right) \tag{101}$$

for $\alpha \in \{w \in \mathbb{C} : -a - b < \operatorname{Re} w < a - b\}$. Let us write $\{I_t\}$ for the NIG process. The
probability density for its value at time $t$ is given by

$$\mathbb{P}_0(I_t \in dx) = \frac{amt\,K_1\!\left(a\sqrt{m^2t^2 + x^2}\right)}{\pi\sqrt{m^2t^2 + x^2}}\exp\left(mt\sqrt{a^2 - b^2} + bx\right)dx, \tag{102}$$

where $K_\nu$ is the modified Bessel function of third kind (Erdelyi 1953). The NIG process can
be represented as a Brownian motion subordinated by an IG process. In particular, let $\{B_t\}$
be a standard Brownian motion, let $\{G_t\}$ be an independent IG process with parameters $a'$
and $b'$, and set $a' = 1$ and $b' = m(a^2 - b^2)^{1/2}$. Then the characteristic function of the process
$\{I_t\}$ defined by

$$I_t = bm^2G_t + mB(G_t) \tag{103}$$

is given by (101). The associated information process is constructed as follows. We fix a
probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and the parameters $a$, $b$, and $m$.

Proposition 15 Let the random variable $X$ satisfy $-a - b < X < a - b$ almost surely, let
$\{G_t^X\}$ be $\mathcal{F}^X$-conditionally IG, with parameters $a' = 1$ and $b' = m(a^2 - (b + X)^2)^{1/2}$, and let
$F_t = m^2G_t^X$. Then the process $\{\xi_t\}$ defined by

$$\xi_t = (b + X)F_t + B(F_t) \tag{104}$$

is a Levy information process with message $X$ and NIG noise, with fiducial exponent (101).

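Proof. We observe that the condition on $\{G_t^X\}$ is that

$$\frac{1}{t}\ln \mathbb{E}^{\mathbb{P}}\left[\exp\left(\alpha G_t^X\right) \mid X\right] = m\sqrt{a^2 - (b + X)^2} - \sqrt{m^2\left(a^2 - (b + X)^2\right) - 2\alpha} \tag{105}$$

for $\alpha \in \mathbb{C}^{\mathrm{I}}$. Thus if we set $\psi_X(\alpha) = \mathbb{E}^{\mathbb{P}}[\exp(\alpha\xi_t) \mid X]$ for $\alpha \in \mathbb{C}^{\mathrm{I}}$ it follows that

$$\begin{aligned}
\psi_X(\alpha) &= \mathbb{E}^{\mathbb{P}}\left[\exp\left(\alpha(b + X)F_t + \alpha B(F_t)\right) \mid X\right] \\
&= \mathbb{E}^{\mathbb{P}}\left[\exp\left(\left(\alpha(b + X) + \tfrac{1}{2}\alpha^2\right)m^2G_t^X\right) \mid X\right] \\
&= \exp\left(mt\sqrt{a^2 - (b + X)^2} - mt\sqrt{a^2 - (b + X)^2 - 2\left(\alpha(b + X) + \tfrac{1}{2}\alpha^2\right)}\right),
\end{aligned} \tag{106}$$

which shows that the conditional exponent is of the required form. □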
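The final step of the NIG calculation, where the square root collapses to $\sqrt{a^2 - (b + X + \alpha)^2}$, can be confirmed numerically. The sketch below is illustrative: it assumes a real argument $\alpha$ with $-a < b + x + \alpha < a$, an IG subordinator with unit first parameter and second parameter $m\sqrt{a^2 - (b + x)^2}$, and the subordinated form $(b + x)F_t + B(F_t)$ with $F_t$ equal to $m^2$ times the subordinator. The function names are hypothetical.

```python
import math


def psi0_nig(alpha, a, b, m):
    """NIG exponent m * (sqrt(a^2 - b^2) - sqrt(a^2 - (b + alpha)^2))."""
    return m * (math.sqrt(a * a - b * b) - math.sqrt(a * a - (b + alpha) ** 2))


def conditional_exponent_nig(alpha, x, a, b, m):
    """Exponent of (b + x) F_t + B(F_t) given X = x, via the IG subordinator."""
    b_prime = m * math.sqrt(a * a - (b + x) ** 2)
    # Conditioning on the subordinator turns the Brownian term into alpha^2 / 2.
    c = m * m * (alpha * (b + x) + 0.5 * alpha * alpha)
    return b_prime - math.sqrt(b_prime * b_prime - 2.0 * c)


if __name__ == "__main__":
    a, b, m, x, alpha = 2.0, 0.5, 1.3, 0.4, 0.3
    print(conditional_exponent_nig(alpha, x, a, b, m),
          psi0_nig(x + alpha, a, b, m) - psi0_nig(x, a, b, m))
```

Agreement of the two values reflects the identity $(b + x)^2 + 2\alpha(b + x) + \alpha^2 = (b + x + \alpha)^2$ under the square root.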

Similar arguments lead to the construction of information processes based on various 
other Levy processes related to the IG distribution, including for example the generalised 
hyperbolic process (Barndorff-Nielsen 1977), for which the information process can be shown 
to take the form 

$$\xi_t = (b + X)G_t + B(G_t). \tag{107}$$

Here the random variable $X$ is taken to be $\mathbb{P}$-independent of the standard Brownian motion
$\{B(t)\}$, and $\{G_t\}$ is $\mathcal{F}^X$-conditionally a generalised IG process with parameters $(\delta, (a^2 - (b +
X)^2)^{1/2}, \nu)$. It would be of interest to determine whether explicit models can be obtained
for information processes based on the Meixner process (Schoutens & Teugels 1998) and the 
CGMY process (Carr et al. 2002, Madan & Yor 2008). 

We conclude this study of Levy information with the following remarks. Recent developments
in the phenomenological representation of physical (Brody & Hughston 2006) and
economic (Brody et al. 2008a) time series have highlighted the idea that signal processing
techniques may have far-reaching applications to the identification, characterisation and
categorisation of phenomena, both in the natural and in the social sciences, and that beyond
the conventional remits of prediction, filtering, and smoothing there is a fourth and important
new domain of applicability: the description of phenomena in science and in society.
It is our hope therefore that the theory of signal processing with Levy information herein
outlined will find a variety of interesting and exciting applications.
and by the Fields Institute, University of Toronto. The authors are grateful to N. Bingham, M. Davis, E.
Hoyle, M. Grasselli, T. Hurd, S. Jaimungal, E. Mackie, A. Macrina, P. Parbhoo and M. Pistorius for helpful
comments and discussions.